cs.LG.xml

<rss version="2.0"><channel><title>Chat Arxiv cs.LG</title><link>https://github.com/qhduan/cn-chat-arxiv</link><description>This is arxiv RSS feed for cs.LG</description><item><title>MIM-Refiner&#26159;&#19968;&#31181;&#23545;&#27604;&#23398;&#20064;&#25552;&#21319;&#26041;&#27861;&#65292;&#36890;&#36807;&#21033;&#29992;MIM&#27169;&#22411;&#20013;&#30340;&#20013;&#38388;&#23618;&#34920;&#31034;&#21644;&#22810;&#20010;&#23545;&#27604;&#22836;&#65292;&#33021;&#22815;&#23558;MIM&#27169;&#22411;&#30340;&#29305;&#24449;&#20174;&#27425;&#20248;&#30340;&#29366;&#24577;&#25552;&#21319;&#21040;&#26368;&#20808;&#36827;&#30340;&#29366;&#24577;&#65292;&#24182;&#22312;ImageNet-1K&#25968;&#25454;&#38598;&#19978;&#21462;&#24471;&#20102;&#26032;&#30340;&#26368;&#20808;&#36827;&#32467;&#26524;&#12290;</title><link>https://arxiv.org/abs/2402.10093</link><description>&lt;p&gt;
MIM-Refiner&#65306;&#19968;&#31181;&#20174;&#20013;&#38388;&#39044;&#35757;&#32451;&#34920;&#31034;&#20013;&#33719;&#24471;&#23545;&#27604;&#23398;&#20064;&#25552;&#21319;&#30340;&#26041;&#27861;
&lt;/p&gt;
&lt;p&gt;
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
&lt;/p&gt;
&lt;p&gt;
https://arxiv.org/abs/2402.10093
&lt;/p&gt;
&lt;p&gt;
MIM-Refiner&#26159;&#19968;&#31181;&#23545;&#27604;&#23398;&#20064;&#25552;&#21319;&#26041;&#27861;&#65292;&#36890;&#36807;&#21033;&#29992;MIM&#27169;&#22411;&#20013;&#30340;&#20013;&#38388;&#23618;&#34920;&#31034;&#21644;&#22810;&#20010;&#23545;&#27604;&#22836;&#65292;&#33021;&#22815;&#23558;MIM&#27169;&#22411;&#30340;&#29305;&#24449;&#20174;&#27425;&#20248;&#30340;&#29366;&#24577;&#25552;&#21319;&#21040;&#26368;&#20808;&#36827;&#30340;&#29366;&#24577;&#65292;&#24182;&#22312;ImageNet-1K&#25968;&#25454;&#38598;&#19978;&#21462;&#24471;&#20102;&#26032;&#30340;&#26368;&#20808;&#36827;&#32467;&#26524;&#12290;
&lt;/p&gt;
&lt;p&gt;

&lt;/p&gt;
&lt;p&gt;
&#25105;&#20204;&#24341;&#20837;&#20102;MIM-Refiner&#65292;&#36825;&#26159;&#19968;&#31181;&#29992;&#20110;&#39044;&#35757;&#32451;MIM&#27169;&#22411;&#30340;&#23545;&#27604;&#23398;&#20064;&#25552;&#21319;&#26041;&#27861;&#12290;MIM-Refiner&#30340;&#21160;&#26426;&#22312;&#20110;MIM&#27169;&#22411;&#20013;&#30340;&#26368;&#20339;&#34920;&#31034;&#36890;&#24120;&#20301;&#20110;&#20013;&#38388;&#23618;&#12290;&#22240;&#27492;&#65292;MIM-Refiner&#21033;&#29992;&#36830;&#25509;&#21040;&#19981;&#21516;&#20013;&#38388;&#23618;&#30340;&#22810;&#20010;&#23545;&#27604;&#22836;&#12290;&#22312;&#27599;&#20010;&#22836;&#20013;&#65292;&#20462;&#25913;&#21518;&#30340;&#26368;&#36817;&#37051;&#30446;&#26631;&#24110;&#21161;&#26500;&#24314;&#30456;&#24212;&#30340;&#35821;&#20041;&#32858;&#31867;&#12290;&#27492;&#36807;&#31243;&#30701;&#32780;&#26377;&#25928;&#65292;&#22312;&#20960;&#20010;epochs&#20869;&#65292;&#25105;&#20204;&#23558;MIM&#27169;&#22411;&#30340;&#29305;&#24449;&#20174;&#27425;&#20248;&#30340;&#29366;&#24577;&#25552;&#21319;&#21040;&#26368;&#20808;&#36827;&#30340;&#29366;&#24577;&#12290;&#20351;&#29992;data2vec 2.0&#22312;ImageNet-1K&#19978;&#39044;&#35757;&#32451;&#30340;ViT-H&#32463;&#36807;&#25913;&#36827;&#21518;&#65292;&#22312;&#32447;&#24615;&#25506;&#27979;&#21644;&#20302;&#26679;&#26412;&#20998;&#31867;&#26041;&#38754;&#21462;&#24471;&#20102;&#26032;&#30340;&#26368;&#20808;&#36827;&#32467;&#26524;&#65288;&#20998;&#21035;&#20026;84.7%&#21644;64.2%&#65289;&#65292;&#36229;&#36807;&#20102;&#22312;ImageNet-1K&#19978;&#39044;&#35757;&#32451;&#30340;&#20854;&#20182;&#27169;&#22411;&#30340;&#34920;&#29616;&#12290;
&lt;/p&gt;
&lt;p&gt;
arXiv:2402.10093v1 Announce Type: cross  Abstract: We introduce MIM (Masked Image Modeling)-Refiner, a contrastive learning boost for pre-trained MIM models. The motivation behind MIM-Refiner is rooted in the insight that optimal representations within MIM models generally reside in intermediate layers. Accordingly, MIM-Refiner leverages multiple contrastive heads that are connected to diverse intermediate layers. In each head, a modified nearest neighbor objective helps to construct respective semantic clusters.   The refinement process is short but effective. Within a few epochs, we refine the features of MIM models from subpar to state-of-the-art, off-the-shelf features. Refining a ViT-H, pre-trained with data2vec 2.0 on ImageNet-1K, achieves new state-of-the-art results in linear probing (84.7%) and low-shot classification among models that are pre-trained on ImageNet-1K. In ImageNet-1K 1-shot classification, MIM-Refiner sets a new state-of-the-art of 64.2%, outperforming larger mo
&lt;/p&gt;</description></item><item><title>&#26412;&#30740;&#31350;&#25506;&#35752;&#20102;&#22312;&#36716;&#31227;&#23398;&#20064;&#29615;&#22659;&#20013;&#22823;&#22411;&#35821;&#35328;&#27169;&#22411;&#30340;&#23610;&#24230;&#34892;&#20026;&#65292;&#21457;&#29616;&#24494;&#35843;&#25968;&#25454;&#38598;&#30340;&#22823;&#23567;&#21644;&#39044;&#35757;&#32451;&#25968;&#25454;&#19982;&#19979;&#28216;&#25968;&#25454;&#30340;&#20998;&#24067;&#19968;&#33268;&#24615;&#23545;&#19979;&#28216;&#24615;&#33021;&#26377;&#26174;&#33879;&#24433;&#21709;&#12290;</title><link>https://arxiv.org/abs/2402.04177</link><description>&lt;p&gt;
&#22823;&#22411;&#35821;&#35328;&#27169;&#22411;&#30340;&#19979;&#28216;&#20219;&#21153;&#24615;&#33021;&#30340;&#23610;&#24230;&#24459;
&lt;/p&gt;
&lt;p&gt;
Scaling Laws for Downstream Task Performance of Large Language Models
&lt;/p&gt;
&lt;p&gt;
https://arxiv.org/abs/2402.04177
&lt;/p&gt;
&lt;p&gt;
&#26412;&#30740;&#31350;&#25506;&#35752;&#20102;&#22312;&#36716;&#31227;&#23398;&#20064;&#29615;&#22659;&#20013;&#22823;&#22411;&#35821;&#35328;&#27169;&#22411;&#30340;&#23610;&#24230;&#34892;&#20026;&#65292;&#21457;&#29616;&#24494;&#35843;&#25968;&#25454;&#38598;&#30340;&#22823;&#23567;&#21644;&#39044;&#35757;&#32451;&#25968;&#25454;&#19982;&#19979;&#28216;&#25968;&#25454;&#30340;&#20998;&#24067;&#19968;&#33268;&#24615;&#23545;&#19979;&#28216;&#24615;&#33021;&#26377;&#26174;&#33879;&#24433;&#21709;&#12290;
&lt;/p&gt;
&lt;p&gt;

&lt;/p&gt;
&lt;p&gt;
&#23610;&#24230;&#24459;&#25552;&#20379;&#20102;&#37325;&#35201;&#30340;&#35265;&#35299;&#65292;&#21487;&#20197;&#25351;&#23548;&#22823;&#22411;&#35821;&#35328;&#27169;&#22411;&#65288;LLM&#65289;&#30340;&#35774;&#35745;&#12290;&#29616;&#26377;&#30740;&#31350;&#20027;&#35201;&#38598;&#20013;&#22312;&#30740;&#31350;&#39044;&#35757;&#32451;&#65288;&#19978;&#28216;&#65289;&#25439;&#22833;&#30340;&#23610;&#24230;&#24459;&#12290;&#28982;&#32780;&#65292;&#22312;&#36716;&#31227;&#23398;&#20064;&#29615;&#22659;&#20013;&#65292;LLM&#20808;&#22312;&#26080;&#30417;&#30563;&#25968;&#25454;&#38598;&#19978;&#36827;&#34892;&#39044;&#35757;&#32451;&#65292;&#28982;&#21518;&#22312;&#19979;&#28216;&#20219;&#21153;&#19978;&#36827;&#34892;&#24494;&#35843;&#65292;&#25105;&#20204;&#36890;&#24120;&#20063;&#20851;&#24515;&#19979;&#28216;&#24615;&#33021;&#12290;&#22312;&#36825;&#39033;&#24037;&#20316;&#20013;&#65292;&#25105;&#20204;&#30740;&#31350;&#20102;&#22312;&#36716;&#31227;&#23398;&#20064;&#29615;&#22659;&#20013;&#30340;&#23610;&#24230;&#34892;&#20026;&#65292;&#20854;&#20013;LLM&#34987;&#24494;&#35843;&#29992;&#20110;&#26426;&#22120;&#32763;&#35793;&#20219;&#21153;&#12290;&#20855;&#20307;&#32780;&#35328;&#65292;&#25105;&#20204;&#30740;&#31350;&#20102;&#39044;&#35757;&#32451;&#25968;&#25454;&#30340;&#36873;&#25321;&#21644;&#22823;&#23567;&#23545;&#19979;&#28216;&#24615;&#33021;&#65288;&#32763;&#35793;&#36136;&#37327;&#65289;&#30340;&#24433;&#21709;&#65292;&#20351;&#29992;&#20102;&#20004;&#20010;&#35780;&#20215;&#25351;&#26631;&#65306;&#19979;&#28216;&#20132;&#21449;&#29109;&#21644;BLEU&#20998;&#25968;&#12290;&#25105;&#20204;&#30340;&#23454;&#39564;&#35777;&#26126;&#65292;&#24494;&#35843;&#25968;&#25454;&#38598;&#30340;&#22823;&#23567;&#21644;&#39044;&#35757;&#32451;&#25968;&#25454;&#19982;&#19979;&#28216;&#25968;&#25454;&#30340;&#20998;&#24067;&#19968;&#33268;&#24615;&#26174;&#33879;&#24433;&#21709;&#23610;&#24230;&#34892;&#20026;&#12290;&#22312;&#20805;&#20998;&#19968;&#33268;&#24615;&#24773;&#20917;&#19979;&#65292;&#19979;&#28216;&#20132;&#21449;&#29109;&#21644;BLEU&#20998;&#25968;&#37117;&#20250;&#36880;&#28176;&#25552;&#21319;&#12290;
&lt;/p&gt;
&lt;p&gt;
Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we study the scaling behavior in a transfer learning setting, where LLMs are finetuned for machine translation tasks. Specifically, we investigate how the choice of the pretraining data and its size affect downstream performance (translation quality) as judged by two metrics: downstream cross-entropy and BLEU score. Our experiments indicate that the size of the finetuning dataset and the distribution alignment between the pretraining and downstream data significantly influence the scaling behavior. With sufficient alignment, both downstream cross-entropy and BLEU score improve monotonically with 
&lt;/p&gt;</description></item><item><title>&#26412;&#25991;&#25552;&#20986;&#20102;&#19968;&#31181;&#26032;&#30340;&#26080;&#30417;&#30563;&#35780;&#20272;&#26041;&#27861;&#65292;&#21033;&#29992;&#21516;&#34892;&#35780;&#23457;&#26426;&#21046;&#22312;&#24320;&#25918;&#29615;&#22659;&#20013;&#34913;&#37327;LLMs&#12290;&#36890;&#36807;&#20026;&#27599;&#20010;LLM&#20998;&#37197;&#21487;&#23398;&#20064;&#30340;&#33021;&#21147;&#21442;&#25968;&#65292;&#20197;&#26368;&#22823;&#21270;&#21508;&#20010;LLM&#30340;&#33021;&#21147;&#21644;&#24471;&#20998;&#30340;&#19968;&#33268;&#24615;&#12290;&#32467;&#26524;&#34920;&#26126;&#65292;&#39640;&#23618;&#27425;&#30340;LLM&#33021;&#22815;&#26356;&#20934;&#30830;&#22320;&#35780;&#20272;&#20854;&#20182;&#27169;&#22411;&#30340;&#31572;&#26696;&#65292;&#24182;&#33021;&#22815;&#33719;&#24471;&#26356;&#39640;&#30340;&#21709;&#24212;&#24471;&#20998;&#12290;</title><link>https://arxiv.org/abs/2402.01830</link><description>&lt;p&gt;
LLM&#20013;&#30340;&#21516;&#34892;&#35780;&#23457;&#26041;&#27861;&#65306;&#24320;&#25918;&#29615;&#22659;&#19979;LLMs&#30340;&#33258;&#21160;&#35780;&#20272;&#26041;&#27861;
&lt;/p&gt;
&lt;p&gt;
Peer-review-in-LLMs: Automatic Evaluation Method for LLMs in Open-environment
&lt;/p&gt;
&lt;p&gt;
https://arxiv.org/abs/2402.01830
&lt;/p&gt;
&lt;p&gt;
&#26412;&#25991;&#25552;&#20986;&#20102;&#19968;&#31181;&#26032;&#30340;&#26080;&#30417;&#30563;&#35780;&#20272;&#26041;&#27861;&#65292;&#21033;&#29992;&#21516;&#34892;&#35780;&#23457;&#26426;&#21046;&#22312;&#24320;&#25918;&#29615;&#22659;&#20013;&#34913;&#37327;LLMs&#12290;&#36890;&#36807;&#20026;&#27599;&#20010;LLM&#20998;&#37197;&#21487;&#23398;&#20064;&#30340;&#33021;&#21147;&#21442;&#25968;&#65292;&#20197;&#26368;&#22823;&#21270;&#21508;&#20010;LLM&#30340;&#33021;&#21147;&#21644;&#24471;&#20998;&#30340;&#19968;&#33268;&#24615;&#12290;&#32467;&#26524;&#34920;&#26126;&#65292;&#39640;&#23618;&#27425;&#30340;LLM&#33021;&#22815;&#26356;&#20934;&#30830;&#22320;&#35780;&#20272;&#20854;&#20182;&#27169;&#22411;&#30340;&#31572;&#26696;&#65292;&#24182;&#33021;&#22815;&#33719;&#24471;&#26356;&#39640;&#30340;&#21709;&#24212;&#24471;&#20998;&#12290;
&lt;/p&gt;
&lt;p&gt;

&lt;/p&gt;
&lt;p&gt;
&#29616;&#26377;&#30340;&#22823;&#22411;&#35821;&#35328;&#27169;&#22411;&#65288;LLMs&#65289;&#35780;&#20272;&#26041;&#27861;&#36890;&#24120;&#38598;&#20013;&#20110;&#22312;&#19968;&#20123;&#26377;&#20154;&#24037;&#27880;&#37322;&#30340;&#23553;&#38381;&#29615;&#22659;&#21644;&#29305;&#23450;&#39046;&#22495;&#22522;&#20934;&#19978;&#27979;&#35797;&#24615;&#33021;&#12290;&#26412;&#25991;&#25506;&#32034;&#20102;&#19968;&#31181;&#26032;&#39062;&#30340;&#26080;&#30417;&#30563;&#35780;&#20272;&#26041;&#27861;&#65292;&#21033;&#29992;&#21516;&#34892;&#35780;&#23457;&#26426;&#21046;&#33258;&#21160;&#34913;&#37327;LLMs&#12290;&#22312;&#36825;&#20010;&#35774;&#32622;&#20013;&#65292;&#24320;&#28304;&#21644;&#38381;&#28304;&#30340;LLMs&#22788;&#20110;&#21516;&#19968;&#29615;&#22659;&#20013;&#65292;&#33021;&#22815;&#22238;&#31572;&#26410;&#26631;&#35760;&#30340;&#38382;&#39064;&#24182;&#20114;&#30456;&#35780;&#20272;&#65292;&#27599;&#20010;LLM&#30340;&#21709;&#24212;&#24471;&#20998;&#30001;&#20854;&#20182;&#21311;&#21517;&#30340;LLMs&#20849;&#21516;&#20915;&#23450;&#12290;&#20026;&#20102;&#33719;&#21462;&#36825;&#20123;&#27169;&#22411;&#20043;&#38388;&#30340;&#33021;&#21147;&#23618;&#27425;&#32467;&#26500;&#65292;&#25105;&#20204;&#20026;&#27599;&#20010;LLM&#20998;&#37197;&#19968;&#20010;&#21487;&#23398;&#20064;&#30340;&#33021;&#21147;&#21442;&#25968;&#26469;&#35843;&#25972;&#26368;&#32456;&#25490;&#24207;&#32467;&#26524;&#12290;&#25105;&#20204;&#23558;&#20854;&#24418;&#24335;&#21270;&#20026;&#19968;&#20010;&#21463;&#32422;&#26463;&#30340;&#20248;&#21270;&#38382;&#39064;&#65292;&#26088;&#22312;&#26368;&#22823;&#21270;&#27599;&#20010;LLM&#30340;&#33021;&#21147;&#21644;&#24471;&#20998;&#30340;&#19968;&#33268;&#24615;&#12290;&#32972;&#21518;&#30340;&#20851;&#38190;&#20551;&#35774;&#26159;&#39640;&#23618;&#27425;&#30340;LLM&#33021;&#22815;&#27604;&#20302;&#23618;&#27425;&#30340;LLM&#26356;&#20934;&#30830;&#22320;&#35780;&#20272;&#20854;&#20182;&#27169;&#22411;&#30340;&#31572;&#26696;&#65292;&#32780;&#39640;&#23618;&#27425;&#30340;LLM&#20063;&#21487;&#20197;&#36798;&#21040;&#36739;&#39640;&#30340;&#21709;&#24212;&#24471;&#20998;&#12290;
&lt;/p&gt;
&lt;p&gt;
Existing large language models (LLMs) evaluation methods typically focus on testing the performance on some closed-environment and domain-specific benchmarks with human annotations. In this paper, we explore a novel unsupervised evaluation direction, utilizing peer-review mechanisms to measure LLMs automatically. In this setting, both open-source and closed-source LLMs lie in the same environment, capable of answering unlabeled questions and evaluating each other, where each LLM's response score is jointly determined by other anonymous ones. To obtain the ability hierarchy among these models, we assign each LLM a learnable capability parameter to adjust the final ranking. We formalize it as a constrained optimization problem, intending to maximize the consistency of each LLM's capabilities and scores. The key assumption behind is that high-level LLM can evaluate others' answers more accurately than low-level ones, while higher-level LLM can also achieve higher response scores. Moreover
&lt;/p&gt;</description></item><item><title>&#25552;&#20986;&#20102;&#19968;&#31181;&#21517;&#20026;GraphFM&#30340;&#22270;&#22240;&#23376;&#20998;&#35299;&#26426;&#26041;&#27861;&#65292;&#36890;&#36807;&#22270;&#32467;&#26500;&#33258;&#28982;&#34920;&#31034;&#29305;&#24449;&#65292;&#24182;&#23558;FM&#30340;&#20132;&#20114;&#21151;&#33021;&#38598;&#25104;&#21040;GNN&#30340;&#29305;&#24449;&#32858;&#21512;&#31574;&#30053;&#20013;&#65292;&#33021;&#22815;&#27169;&#25311;&#20219;&#24847;&#38454;&#29305;&#24449;&#20132;&#20114;&#12290;</title><link>https://arxiv.org/abs/2105.11866</link><description>&lt;p&gt;
GraphFM&#65306;&#22270;&#22240;&#23376;&#20998;&#35299;&#26426;&#29992;&#20110;&#29305;&#24449;&#20132;&#20114;&#24314;&#27169;
&lt;/p&gt;
&lt;p&gt;
GraphFM: Graph Factorization Machines for Feature Interaction Modeling
&lt;/p&gt;
&lt;p&gt;
https://arxiv.org/abs/2105.11866
&lt;/p&gt;
&lt;p&gt;
&#25552;&#20986;&#20102;&#19968;&#31181;&#21517;&#20026;GraphFM&#30340;&#22270;&#22240;&#23376;&#20998;&#35299;&#26426;&#26041;&#27861;&#65292;&#36890;&#36807;&#22270;&#32467;&#26500;&#33258;&#28982;&#34920;&#31034;&#29305;&#24449;&#65292;&#24182;&#23558;FM&#30340;&#20132;&#20114;&#21151;&#33021;&#38598;&#25104;&#21040;GNN&#30340;&#29305;&#24449;&#32858;&#21512;&#31574;&#30053;&#20013;&#65292;&#33021;&#22815;&#27169;&#25311;&#20219;&#24847;&#38454;&#29305;&#24449;&#20132;&#20114;&#12290;
&lt;/p&gt;
&lt;p&gt;

&lt;/p&gt;
&lt;p&gt;
&#22240;&#23376;&#20998;&#35299;&#26426;&#65288;FM&#65289;&#26159;&#22788;&#29702;&#39640;&#32500;&#31232;&#30095;&#25968;&#25454;&#26102;&#24314;&#27169;&#25104;&#23545;&#65288;&#20108;&#38454;&#65289;&#29305;&#24449;&#20132;&#20114;&#30340;&#19968;&#31181;&#24120;&#35265;&#26041;&#27861;&#12290;&#28982;&#32780;&#65292;&#19968;&#26041;&#38754;&#65292;FM&#26410;&#33021;&#25429;&#25417;&#21040;&#39640;&#38454;&#29305;&#24449;&#20132;&#20114;&#65292;&#21463;&#21040;&#32452;&#21512;&#25193;&#23637;&#30340;&#24433;&#21709;&#12290;&#21478;&#19968;&#26041;&#38754;&#65292;&#32771;&#34385;&#27599;&#23545;&#29305;&#24449;&#20043;&#38388;&#30340;&#20132;&#20114;&#21487;&#33021;&#20250;&#24341;&#20837;&#22122;&#22768;&#24182;&#38477;&#20302;&#39044;&#27979;&#20934;&#30830;&#24615;&#12290;&#20026;&#20102;&#35299;&#20915;&#36825;&#20123;&#38382;&#39064;&#65292;&#25105;&#20204;&#25552;&#20986;&#20102;&#19968;&#31181;&#26032;&#26041;&#27861;&#65292;&#31216;&#20026;Graph Factorization Machine&#65288;GraphFM&#65289;&#65292;&#36890;&#36807;&#23558;&#29305;&#24449;&#33258;&#28982;&#34920;&#31034;&#25104;&#22270;&#32467;&#26500;&#12290;&#20855;&#20307;&#32780;&#35328;&#65292;&#25105;&#20204;&#35774;&#35745;&#20102;&#19968;&#31181;&#26426;&#21046;&#26469;&#36873;&#25321;&#26377;&#30410;&#30340;&#29305;&#24449;&#20132;&#20114;&#65292;&#24182;&#23558;&#20854;&#24418;&#24335;&#21270;&#20026;&#29305;&#24449;&#20043;&#38388;&#30340;&#36793;&#12290;&#28982;&#21518;&#65292;&#25152;&#25552;&#20986;&#30340;&#27169;&#22411;&#23558;FM&#30340;&#20132;&#20114;&#21151;&#33021;&#25972;&#21512;&#21040;&#22270;&#31070;&#32463;&#32593;&#32476;&#65288;GNN&#65289;&#30340;&#29305;&#24449;&#32858;&#21512;&#31574;&#30053;&#20013;&#65292;&#36890;&#36807;&#22534;&#21472;&#23618;&#26469;&#27169;&#25311;&#22270;&#32467;&#26500;&#29305;&#24449;&#19978;&#30340;&#20219;&#24847;&#38454;&#29305;&#24449;&#20132;&#20114;&#12290;
&lt;/p&gt;
&lt;p&gt;
arXiv:2105.11866v4 Announce Type: replace-cross  Abstract: Factorization machine (FM) is a prevalent approach to modeling pairwise (second-order) feature interactions when dealing with high-dimensional sparse data. However, on the one hand, FM fails to capture higher-order feature interactions suffering from combinatorial expansion. On the other hand, taking into account interactions between every pair of features may introduce noise and degrade prediction accuracy. To solve the problems, we propose a novel approach, Graph Factorization Machine (GraphFM), by naturally representing features in the graph structure. In particular, we design a mechanism to select the beneficial feature interactions and formulate them as edges between features. Then the proposed model, which integrates the interaction function of FM into the feature aggregation strategy of Graph Neural Network (GNN), can model arbitrary-order feature interactions on the graph-structured features by stacking layers. Experime
&lt;/p&gt;</description></item><item><title>&#25105;&#20204;&#25552;&#20986;&#20102;&#19968;&#31181;&#21487;&#36870;&#35299;&#20915;&#38750;&#35268;&#21017;&#37319;&#26679;&#26102;&#38388;&#24207;&#21015;&#30340;&#31070;&#32463;&#24494;&#20998;&#26041;&#31243;&#20998;&#26512;&#26041;&#27861;&#65292;&#36890;&#36807;&#24341;&#20837;&#31070;&#32463;&#27969;&#30340;&#27010;&#24565;&#65292;&#25105;&#20204;&#30340;&#26041;&#27861;&#26082;&#20445;&#35777;&#20102;&#21487;&#36870;&#24615;&#21448;&#38477;&#20302;&#20102;&#35745;&#31639;&#36127;&#25285;&#65292;&#24182;&#19988;&#22312;&#20998;&#31867;&#21644;&#25554;&#20540;&#20219;&#21153;&#20013;&#34920;&#29616;&#20986;&#20102;&#20248;&#24322;&#30340;&#24615;&#33021;&#12290;</title><link>http://arxiv.org/abs/2401.04979</link><description>&lt;p&gt;
&#21487;&#36870;&#35299;&#20915;&#38750;&#35268;&#21017;&#37319;&#26679;&#26102;&#38388;&#24207;&#21015;&#30340;&#31070;&#32463;&#24494;&#20998;&#26041;&#31243;&#20998;&#26512;&#26041;&#27861;
&lt;/p&gt;
&lt;p&gt;
Invertible Solution of Neural Differential Equations for Analysis of Irregularly-Sampled Time Series. (arXiv:2401.04979v1 [cs.LG])
&lt;/p&gt;
&lt;p&gt;
http://arxiv.org/abs/2401.04979
&lt;/p&gt;
&lt;p&gt;
&#25105;&#20204;&#25552;&#20986;&#20102;&#19968;&#31181;&#21487;&#36870;&#35299;&#20915;&#38750;&#35268;&#21017;&#37319;&#26679;&#26102;&#38388;&#24207;&#21015;&#30340;&#31070;&#32463;&#24494;&#20998;&#26041;&#31243;&#20998;&#26512;&#26041;&#27861;&#65292;&#36890;&#36807;&#24341;&#20837;&#31070;&#32463;&#27969;&#30340;&#27010;&#24565;&#65292;&#25105;&#20204;&#30340;&#26041;&#27861;&#26082;&#20445;&#35777;&#20102;&#21487;&#36870;&#24615;&#21448;&#38477;&#20302;&#20102;&#35745;&#31639;&#36127;&#25285;&#65292;&#24182;&#19988;&#22312;&#20998;&#31867;&#21644;&#25554;&#20540;&#20219;&#21153;&#20013;&#34920;&#29616;&#20986;&#20102;&#20248;&#24322;&#30340;&#24615;&#33021;&#12290;
&lt;/p&gt;
&lt;p&gt;

&lt;/p&gt;
&lt;p&gt;
&#20026;&#20102;&#22788;&#29702;&#38750;&#35268;&#21017;&#21644;&#19981;&#23436;&#25972;&#30340;&#26102;&#38388;&#24207;&#21015;&#25968;&#25454;&#30340;&#22797;&#26434;&#24615;&#65292;&#25105;&#20204;&#25552;&#20986;&#20102;&#19968;&#31181;&#22522;&#20110;&#31070;&#32463;&#24494;&#20998;&#26041;&#31243;&#65288;NDE&#65289;&#30340;&#21487;&#36870;&#35299;&#20915;&#26041;&#26696;&#12290;&#34429;&#28982;&#22522;&#20110;NDE&#30340;&#26041;&#27861;&#26159;&#20998;&#26512;&#38750;&#35268;&#21017;&#37319;&#26679;&#26102;&#38388;&#24207;&#21015;&#30340;&#19968;&#31181;&#24378;&#22823;&#26041;&#27861;&#65292;&#20294;&#23427;&#20204;&#36890;&#24120;&#19981;&#33021;&#20445;&#35777;&#22312;&#20854;&#26631;&#20934;&#24418;&#24335;&#19979;&#36827;&#34892;&#21487;&#36870;&#21464;&#25442;&#12290;&#25105;&#20204;&#30340;&#26041;&#27861;&#24314;&#35758;&#20351;&#29992;&#20855;&#26377;&#31070;&#32463;&#27969;&#30340;&#31070;&#32463;&#25511;&#21046;&#24494;&#20998;&#26041;&#31243;&#65288;Neural CDEs&#65289;&#30340;&#21464;&#31181;&#65292;&#35813;&#26041;&#27861;&#22312;&#20445;&#25345;&#36739;&#20302;&#30340;&#35745;&#31639;&#36127;&#25285;&#30340;&#21516;&#26102;&#30830;&#20445;&#20102;&#21487;&#36870;&#24615;&#12290;&#27492;&#22806;&#65292;&#23427;&#36824;&#21487;&#20197;&#35757;&#32451;&#21452;&#37325;&#28508;&#22312;&#31354;&#38388;&#65292;&#22686;&#24378;&#20102;&#23545;&#21160;&#24577;&#26102;&#38388;&#21160;&#21147;&#23398;&#30340;&#24314;&#27169;&#33021;&#21147;&#12290;&#25105;&#20204;&#30340;&#30740;&#31350;&#25552;&#20986;&#20102;&#19968;&#20010;&#20808;&#36827;&#30340;&#26694;&#26550;&#65292;&#22312;&#20998;&#31867;&#21644;&#25554;&#20540;&#20219;&#21153;&#20013;&#37117;&#34920;&#29616;&#20986;&#33394;&#12290;&#25105;&#20204;&#26041;&#27861;&#30340;&#26680;&#24515;&#26159;&#19968;&#20010;&#32463;&#36807;&#31934;&#24515;&#35774;&#35745;&#30340;&#22686;&#24378;&#22411;&#21452;&#37325;&#28508;&#22312;&#29366;&#24577;&#26550;&#26500;&#65292;&#29992;&#20110;&#22312;&#21508;&#31181;&#26102;&#38388;&#24207;&#21015;&#20219;&#21153;&#20013;&#25552;&#39640;&#31934;&#24230;&#12290;&#23454;&#35777;&#20998;&#26512;&#34920;&#26126;&#65292;&#25105;&#20204;&#30340;&#26041;&#27861;&#26126;&#26174;&#20248;&#20110;&#29616;&#26377;&#27169;&#22411;&#12290;
&lt;/p&gt;
&lt;p&gt;
To handle the complexities of irregular and incomplete time series data, we propose an invertible solution of Neural Differential Equations (NDE)-based method. While NDE-based methods are a powerful method for analyzing irregularly-sampled time series, they typically do not guarantee reversible transformations in their standard form. Our method suggests the variation of Neural Controlled Differential Equations (Neural CDEs) with Neural Flow, which ensures invertibility while maintaining a lower computational burden. Additionally, it enables the training of a dual latent space, enhancing the modeling of dynamic temporal dynamics. Our research presents an advanced framework that excels in both classification and interpolation tasks. At the core of our approach is an enhanced dual latent states architecture, carefully designed for high precision across various time series tasks. Empirical analysis demonstrates that our method significantly outperforms existing models. This work significan
&lt;/p&gt;</description></item><item><title>&#25552;&#20986;&#20102;&#19968;&#31181;&#21517;&#20026; GND-Nets &#30340;&#22270;&#31070;&#32463;&#32593;&#32476;&#65292;&#21033;&#29992;&#27973;&#23618;&#32593;&#32476;&#21644;&#23616;&#37096;&#12289;&#20840;&#23616;&#37051;&#22495;&#20449;&#24687;&#26469;&#35299;&#20915;&#22270;&#21322;&#30417;&#30563;&#23398;&#20064;&#20013;&#30340;&#36807;&#24230;&#24179;&#28369;&#21644;&#27424;&#24179;&#28369;&#38382;&#39064;&#12290;</title><link>http://arxiv.org/abs/2201.09698</link><description>&lt;p&gt;
&#22270;&#31070;&#32463;&#25193;&#25955;&#32593;&#32476;&#29992;&#20110;&#21322;&#30417;&#30563;&#23398;&#20064;
&lt;/p&gt;
&lt;p&gt;
Graph Neural Diffusion Networks for Semi-supervised Learning. (arXiv:2201.09698v2 [cs.LG] UPDATED)
&lt;/p&gt;
&lt;p&gt;
http://arxiv.org/abs/2201.09698
&lt;/p&gt;
&lt;p&gt;
&#25552;&#20986;&#20102;&#19968;&#31181;&#21517;&#20026; GND-Nets &#30340;&#22270;&#31070;&#32463;&#32593;&#32476;&#65292;&#21033;&#29992;&#27973;&#23618;&#32593;&#32476;&#21644;&#23616;&#37096;&#12289;&#20840;&#23616;&#37051;&#22495;&#20449;&#24687;&#26469;&#35299;&#20915;&#22270;&#21322;&#30417;&#30563;&#23398;&#20064;&#20013;&#30340;&#36807;&#24230;&#24179;&#28369;&#21644;&#27424;&#24179;&#28369;&#38382;&#39064;&#12290;
&lt;/p&gt;
&lt;p&gt;

&lt;/p&gt;
&lt;p&gt;
&#22270;&#21367;&#31215;&#32593;&#32476; (GCN) &#26159;&#29992;&#20110;&#22522;&#20110;&#22270;&#30340;&#21322;&#30417;&#30563;&#23398;&#20064;&#30340;&#20808;&#39537;&#27169;&#22411;&#12290;&#28982;&#32780;&#65292;GCN &#22312;&#26631;&#35760;&#31232;&#30095;&#30340;&#22270;&#19978;&#34920;&#29616;&#19981;&#20339;&#12290;&#20854;&#20004;&#23618;&#29256;&#26412;&#19981;&#33021;&#26377;&#25928;&#22320;&#23558;&#26631;&#31614;&#20449;&#24687;&#20256;&#25773;&#21040;&#25972;&#20010;&#22270;&#32467;&#26500;&#65288;&#21363;&#27424;&#24179;&#28369;&#38382;&#39064;&#65289;&#65292;&#32780;&#20854;&#28145;&#23618;&#29256;&#26412;&#21017;&#36807;&#24230;&#24179;&#28369;&#19988;&#38590;&#20197;&#35757;&#32451;&#65288;&#21363;&#36807;&#24230;&#24179;&#28369;&#38382;&#39064;&#65289;&#12290;&#20026;&#20102;&#35299;&#20915;&#36825;&#20004;&#20010;&#38382;&#39064;&#65292;&#25105;&#20204;&#25552;&#20986;&#20102;&#19968;&#31181;&#26032;&#30340;&#22270;&#31070;&#32463;&#32593;&#32476;&#65292;&#31216;&#20026; GND-Nets&#65288;&#22270;&#31070;&#32463;&#25193;&#25955;&#32593;&#32476;&#65289;&#65292;&#23427;&#22312;&#21333;&#23618;&#20013;&#21033;&#29992;&#20102;&#39030;&#28857;&#30340;&#23616;&#37096;&#21644;&#20840;&#23616;&#37051;&#22495;&#20449;&#24687;&#12290;&#21033;&#29992;&#27973;&#23618;&#32593;&#32476;&#21487;&#20197;&#32531;&#35299;&#36807;&#24230;&#24179;&#28369;&#38382;&#39064;&#65292;&#32780;&#21033;&#29992;&#23616;&#37096;&#21644;&#20840;&#23616;&#37051;&#22495;&#20449;&#24687;&#21487;&#20197;&#32531;&#35299;&#27424;&#24179;&#28369;&#38382;&#39064;&#12290;&#39030;&#28857;&#30340;&#23616;&#37096;&#21644;&#20840;&#23616;&#37051;&#22495;&#20449;&#24687;&#30340;&#21033;&#29992;&#26159;&#36890;&#36807;&#19968;&#31181;&#31216;&#20026;&#31070;&#32463;&#25193;&#25955;&#30340;&#26032;&#22270;&#25193;&#25955;&#26041;&#27861;&#23454;&#29616;&#30340;&#65292;&#35813;&#26041;&#27861;&#23558;&#31070;&#32463;&#32593;&#32476;&#34701;&#20837;&#20256;&#32479;&#30340;&#32447;&#24615;&#21644;&#38750;&#32447;&#24615;&#22270;&#25193;&#25955;&#20013;&#12290;
&lt;/p&gt;
&lt;p&gt;
Graph Convolutional Networks (GCN) is a pioneering model for graph-based semi-supervised learning. However, GCN does not perform well on sparsely-labeled graphs. Its two-layer version cannot effectively propagate the label information to the whole graph structure (i.e., the under-smoothing problem) while its deep version over-smoothens and is hard to train (i.e., the over-smoothing problem). To solve these two issues, we propose a new graph neural network called GND-Nets (for Graph Neural Diffusion Networks) that exploits the local and global neighborhood information of a vertex in a single layer. Exploiting the shallow network mitigates the over-smoothing problem while exploiting the local and global neighborhood information mitigates the under-smoothing problem. The utilization of the local and global neighborhood information of a vertex is achieved by a new graph diffusion method called neural diffusions, which integrate neural networks into the conventional linear and nonlinear gra
&lt;/p&gt;</description></item></channel></rss>