
Latent Relation Representations for Universal Schemas

Sebastian Riedel
Department of Computer Science, University College London
[email protected]

Limin Yao, Andrew McCallum
Department of Computer Science, University of Massachusetts at Amherst
{lmyao,mccallum}@cs.umass.edu

1 Introduction

Supervised relation extraction uses a pre-defined schema of relation types (such as born-in or employed-by). This approach requires labeling textual relations, a time-consuming and difficult process. This has led to significant interest in distantly-supervised learning. Here one aligns existing database records with the sentences in which these records have been "rendered", and from this labeling one can train a machine learning system as before [1, 2]. However, this method relies on the availability of a large database that has the desired schema.

The need for pre-existing databases can be avoided by not having any fixed schema. This is the approach taken by OpenIE [3]. Here surface patterns between mentions of concepts serve as relations. This approach requires no supervision and has tremendous flexibility, but lacks the ability to generalize. For example, OpenIE may find FERGUSON–historian-at–HARVARD but does not know FERGUSON–is-a-professor-at–HARVARD.

One way to gain generalization is to cluster textual surface forms that have similar meaning [4, 5, 6, 7]. While the clusters discovered by all these methods usually contain semantically related items, closer inspection invariably shows that they do not provide reliable implicature. For example, a cluster may include historian-at, professor-at, scientist-at, and worked-at. However, scientist-at does not necessarily imply professor-at, and worked-at certainly does not imply scientist-at. In fact, we contend that any relational schema would inherently be brittle and ill-defined, having ambiguities, problematic boundary cases, and incompleteness.

In response to this problem, we present a new approach: implicature with universal schemas. Here we embrace the diversity and ambiguity of original inputs. This is accomplished by defining our schema to be the union of all source schemas: original input forms, e.g. variants of surface patterns similarly to OpenIE, as well as relations in the schemas of pre-existing structured databases. But unlike OpenIE, we learn asymmetric implicature among relations and entity types. This allows us to probabilistically "fill in" inferred unobserved entity-entity relations in this union. For example, after observing FERGUSON–historian-at–HARVARD, our system infers that FERGUSON–professor-at–HARVARD, but not vice versa.

At the heart of our approach is the hypothesis that we should concentrate on predicting source data, a relatively well-defined task that can be evaluated and optimized, as opposed to modeling semantic equivalence, which we believe will always be elusive.

To reason with a universal schema, we learn latent feature representations of relations, tuples, and entities. These act, through dot products, as natural parameters of a log-linear model for the probability that a given relation holds for a given tuple. We show experimentally that this approach significantly outperforms a comparable baseline without latent features, as well as the current state-of-the-art distant supervision method.

2 Model

We use $R$ to denote the set of relations we seek to predict (such as works-written in Freebase, or the X–heads–Y pattern), and $T$ to denote the set of input tuples. For simplicity we assume each relation to be binary. Given a relation $r \in R$ and a tuple $t \in T$, the pair $\langle r, t \rangle$ is a fact, or relation instance. The input to our model is a set $O$ of observed facts, and the observed facts for a given tuple $t$ are $O_t := \{\langle r, t \rangle \in O\}$.
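To make this input representation concrete, the following minimal Python sketch (our illustration, not the authors' code; the relation names are hypothetical placeholders) builds the observed fact set $O$ and groups it by tuple to obtain $O_t$:

```python
from collections import defaultdict

# O: observed facts <r, t>, where r may be a surface pattern or a relation
# from a structured source, and t is an entity tuple. Relation names here
# are hypothetical placeholders, not actual Freebase identifiers.
observed_facts = {
    ("historian-at", ("FERGUSON", "HARVARD")),         # surface pattern
    ("employment/employer", ("FERGUSON", "HARVARD")),  # structured relation
}

# O_t: observed facts grouped by tuple, i.e. the observed cells of the
# tuple-by-relation matrix that the models below will complete.
facts_by_tuple = defaultdict(set)
for relation, entity_tuple in observed_facts:
    facts_by_tuple[entity_tuple].add(relation)

print(facts_by_tuple[("FERGUSON", "HARVARD")])
```

Each tuple thus corresponds to one sparsely observed row of a tuple-by-relation matrix, which sets up the completion problem described next.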
Our goal is a model that can estimate, for a given relation $r$ (such as X–historian-at–Y) and a given tuple $t$ (such as $\langle$FERGUSON, HARVARD$\rangle$), a score $c_{r,t}$ for the fact $\langle r, t \rangle$. This matrix completion problem is related to collaborative filtering. We can think of each tuple as a customer, and each relation as a product. Our goal is to predict how the tuple rates the relation (rating 0 = false, rating 1 = true), based on the observed ratings in $O$. We interpret $c_{r,t}$ as the probability $p(y_{r,t} = 1)$, where $y_{r,t}$ is a binary random variable that is true iff $\langle r, t \rangle$ holds. To this end we introduce a series of exponential family models inspired by generalized PCA [8], a probabilistic generalization of Principal Component Analysis. These models estimate the confidence in $\langle r, t \rangle$ using a natural parameter $\theta_{r,t}$ and the logistic function:

$$c_{r,t} := p(y_{r,t} = 1 \mid \theta_{r,t}) := \frac{1}{1 + \exp(-\theta_{r,t})}.$$

We follow [9] and use a ranking-based objective function to estimate the parameters of our models.

Latent Feature Model. One way to define $\theta_{r,t}$ is through a latent feature model F. We measure compatibility between relation $r$ and tuple $t$ as a dot product of two latent feature representations of size $K^F$: $a_r$ for relation $r$, and $v_t$ for tuple $t$. This gives

$$\theta^F_{r,t} := \sum_{k=1}^{K^F} a_{r,k} v_{t,k}$$

and corresponds to the original generalized PCA that learns a low-rank factorization of $\Theta = (\theta_{r,t})$.

Neighborhood Model. We can interpolate the confidence for a given tuple and relation based on the trueness of other, similar relations for the same tuple. In collaborative filtering this is referred to as a neighborhood-based approach [10]. We implement a neighborhood model N via a set of weights $w_{r,r'}$, where each corresponds to a directed association strength between relations $r$ and $r'$. Summing these up gives

$$\theta^N_{r,t} := \sum_{r' \in O_t \setminus \{r\}} w_{r,r'}.$$

Notice that the neighborhood model amounts to a collection of local log-linear classifiers, one for each relation $r$ with weights $w_r$.

Entity Model. Relations have selectional preferences: they allow only certain types in their argument slots. To capture this observation, we learn a latent entity representation from data. For each entity $e$ we introduce a latent feature vector $t_e \in \mathbb{R}^{K^E}$. In addition, for each relation $r$ and argument slot $i$ we introduce a feature vector $d_{r,i}$. Measuring the compatibility of an entity tuple and a relation amounts to summing up the compatibilities between each argument slot representation and the corresponding entity representation:

$$\theta^E_{r,t} := \sum_{i=1}^{\operatorname{arity}(r)} \sum_{k=1}^{K^E} d_{r,i,k} \, t_{t_i,k}.$$

Combined Models. In practice all the above models capture important aspects of the data. Hence we also use various combinations, such as $\theta^{N,F,E}_{r,t} := \theta^N_{r,t} + \theta^F_{r,t} + \theta^E_{r,t}$.
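The scoring components above can be summarized in a short sketch. This is a minimal, hypothetical NumPy rendering with illustrative sizes and random parameters standing in for learned ones; it is not the authors' implementation, and parameter estimation (the ranking-based objective of [9]) is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (the paper's data has roughly 4k relations and 400k tuples).
n_relations, n_tuples, n_entities = 1000, 5000, 3000
K_F, K_E = 50, 50  # latent dimensions of the F and E models

A = rng.normal(scale=0.1, size=(n_relations, K_F))           # a_r
V = rng.normal(scale=0.1, size=(n_tuples, K_F))              # v_t
W = rng.normal(scale=0.01, size=(n_relations, n_relations))  # w_{r,r'} (dense here; sparse in practice)
T = rng.normal(scale=0.1, size=(n_entities, K_E))            # t_e
D = rng.normal(scale=0.1, size=(n_relations, 2, K_E))        # d_{r,i}, binary relations

def theta_F(r, t):
    # Latent feature model: dot product of relation and tuple vectors.
    return A[r] @ V[t]

def theta_N(r, observed_relations):
    # Neighborhood model: sum of directed weights w_{r,r'} over the other
    # relations r' observed for the same tuple.
    return sum(W[r, rp] for rp in observed_relations if rp != r)

def theta_E(r, entities):
    # Entity model: each argument slot vector dotted with its entity vector.
    return sum(D[r, i] @ T[e] for i, e in enumerate(entities))

def confidence(r, t, observed_relations, entities):
    # Combined N,F,E model under the logistic link: c_{r,t} = p(y_{r,t} = 1).
    theta = theta_N(r, observed_relations) + theta_F(r, t) + theta_E(r, entities)
    return 1.0 / (1.0 + np.exp(-theta))

# Example: score relation 3 for tuple 7, whose entities are 11 and 42 and
# whose other observed relations are {5, 9}.
print(confidence(3, 7, {5, 9}, (11, 42)))
```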
3 Experiments

Does reasoning jointly across a universal schema help to improve over more isolated approaches? In the following we seek to answer this question empirically.

Data. Our experimental setup is roughly equivalent to previous work [2], and hence we omit details. To summarize, we consider each pair $\langle t_1, t_2 \rangle$ of Freebase entities that appear together in a corpus. Its set of observed facts $O_t$ corresponds to: extracted surface patterns (in our case lexicalized dependency paths) between mentions of $t_1$ and $t_2$, and the relations of $t_1$ and $t_2$ in Freebase. We divide all our tuples into approximately 200k training tuples and 200k test tuples. The total number of relations (patterns and from Freebase) is approximately 4k.

Predicting Freebase and Surface Pattern Relations. For evaluation we use two collections of relations: Freebase relations and surface patterns. In either case we compare the competing systems with respect to their ranked results for each relation in the collection.

Our first baseline is MI09, a distantly supervised classifier based on the work of [1]. We also compare against YA11, a version of MI09 that uses preprocessed pattern cluster features according to [7]. The third baseline is SU12, the state-of-the-art Multi-Instance Multi-Label system by [11]. The remaining systems are our neighborhood model (N), the factorized model (F), their combination (NF), and the combined model with a latent entity representation (NFE).

The results in terms of mean average precision (with respect to pooled results from each system) are in the table below:

Relation         #     MI09   YA11   SU12   N      F      NF     NFE
Total Freebase   334   0.48   0.52   0.57   0.52   0.66   0.67   0.69
Total Pattern    329   -      -      -      0.28   0.56   0.50   0.46

For Freebase relations, we can see that adding pattern cluster features (and hence incorporating more data) helps YA11 to improve over MI09. Likewise, we see that the factorized model F improves over N, again learning from unlabeled data. This improvement is bigger than the corresponding change between MI09 and YA11, possibly indicating that our latent representations are optimized directly towards improving prediction performance. Our best model, the combination of N, F and E, outperforms all other models in terms of total MAP, indicating the power of selectional preferences learned from data.

MI09, YA11 and SU12 are designed to predict structured relations, and so we omit them from the results on surface patterns. Looking at our models for predicting tuples of surface patterns, we again see that learning a latent representation (the F, NF and NFE models) from additional data helps substantially over the non-latent N model.

All our models are fast to train. The slowest model trains in just 30 minutes. By contrast, training the topic model in YA11 alone takes 4 hours. Training SU12 takes two hours (on less data). Also notice that our models not only learn to predict Freebase relations, but also approximately 4k surface pattern relations.
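For reference, the metric reported above can be sketched in a few lines: mean average precision over per-relation ranked lists. This is our illustrative sketch, not the authors' evaluation code, and the pooling of judged results across systems is not shown:

```python
def average_precision(ranked, relevant):
    # Average precision of one ranked result list against a set of true facts.
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / max(len(relevant), 1)

def mean_average_precision(per_relation_results):
    # Mean over (ranked list, true set) pairs, one pair per relation.
    return sum(average_precision(r, g) for r, g in per_relation_results) / len(per_relation_results)

# Example: one relation, three true facts among five ranked predictions.
print(mean_average_precision([(["a", "b", "c", "d", "e"], {"a", "c", "e"})]))
```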
4 Conclusion

We represent relations using universal schemas. Such schemas contain surface patterns as relations, as well as relations from structured sources. We can predict missing tuples for surface pattern relations and structured schema relations. We show this experimentally by contrasting a series of popular weakly supervised models with our collaborative filtering models, which learn latent feature representations across surface patterns and structured relations. Moreover, our models are computationally efficient, requiring less time than comparable methods while learning more relations.

Reasoning with universal schemas is not merely a tool for information extraction. It can also serve as a framework for various data integration tasks, for example, schema matching. In future work we also plan to integrate universal entity types and attributes into the model.

References

[1] Mike Mintz, Steven Bills, Rion Snow, and Daniel Jurafsky. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP (ACL '09), pages 1003-1011. Association for Computational Linguistics, 2009.

[2] Sebastian Riedel, Limin Yao, and Andrew McCallum. Modeling relations and their mentions without labeled text. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD '10), 2010.

[3] Oren Etzioni, Michele Banko, Stephen Soderland, and Daniel S. Weld. Open information extraction from the web. Communications of the ACM, 51(12):68-74, 2008.

[4] Dekang Lin and Patrick Pantel. DIRT: discovery of inference rules from text. In Knowledge Discovery and Data Mining, pages 323-328, 2001.

[5] Patrick Pantel, Rahul Bhagat, Bonaventura Coppola, Timothy Chklovski, and Eduard Hovy. ISP: Learning inferential selectional preferences. In Proceedings of NAACL HLT, 2007.

[6] Alexander Yates and Oren Etzioni. Unsupervised methods for determining object and relation synonyms on the web. Journal of Artificial Intelligence Research, 34:255-296, 2009.

[7] Limin Yao, Aria Haghighi, Sebastian Riedel, and Andrew McCallum. Structured relation discovery using generative models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '11), July 2011.

[8] Michael Collins, Sanjoy Dasgupta, and Robert E. Schapire. A generalization of principal component analysis to the exponential family. In Proceedings of NIPS, 2001.

[9] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of UAI, 2009.

[10] Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), pages 426-434, New York, NY, USA, 2008. ACM.

[11] Mihai Surdeanu, Julie Tibshirani, Ramesh Nallapati, and Christopher D. Manning. Multi-instance multi-label learning for relation extraction. In Proceedings of EMNLP-CoNLL, 2012.
