Application partitioning and mapping techniques for heterogeneous parallel platforms
Universidad Carlos III de Madrid
Escuela Politécnica Superior

PhD Thesis

Application partitioning and mapping techniques for heterogeneous parallel platforms

Leganés, June 2016

Author: Rafael Sotomayor Fernández
Advisor: J. Daniel García Sánchez

President: D. ............................................................
Member: D. ............................................................
Secretary: D. ............................................................

Defence and reading of the thesis held in .............................. on ........ of .......................... 2016.

Grade: ..............................................

The President    The Secretary    The Members

Abstract

In recent years, performance gains provided by clock frequency and ILP techniques have considerably slowed down. As a result, parallel programming has become the dominant programming paradigm used to improve performance in multi-core devices. In line with this, parallel use of specialised accelerators has started gaining importance. However, adapting legacy source code in order to make use of these technologies is a time-consuming and error-prone task, requiring specialised knowledge.

The main goal of this Thesis is to simplify the task of transforming sequential legacy code into parallel code. This code will be capable of making full use of the different computing devices that a heterogeneous parallel platform can have, such as modern CPUs, GPUs, FPGAs, and DSPs. With this, it is possible to improve sequential code based on different criteria, such as time performance.

As a result, we propose an architecture description language to describe heterogeneous parallel platforms. We suggest a new software annotation syntax to describe the behaviour of the code from a high-level point of view while preserving its maintainability, along with automatic annotation techniques. Finally, we propose a set of task partitioning techniques to split the code and execute it in parallel using the available computing devices.
Results aim to demonstrate that the proposed techniques can be applied to different accelerator devices and source code, and that the chosen metrics are improved with respect to the original sequential code.

Resumen (Spanish abstract)

In recent years, the performance gains derived from clock frequency and ILP techniques have declined considerably. As a result, parallel programming has become the predominant paradigm for improving performance on multi-core devices. Consequently, the parallel use of specialised accelerators has begun to gain importance. However, adapting legacy code to make use of these technologies is a tedious and error-prone task that requires very specific knowledge.

The main goal of this Thesis is to simplify the task of transforming sequential legacy code into parallel code. This code will be able to use the different computing devices available in a heterogeneous parallel platform, such as CPUs, GPUs, FPGAs, or DSPs. In this way, sequential code can be improved according to different criteria, such as performance.

As a result, an architecture description language is proposed to describe heterogeneous parallel platforms. A new software annotation syntax is suggested to describe the behaviour of the program from a high-level point of view while preserving its maintainability, along with automatic annotation techniques. Finally, a set of partitioning techniques is proposed to split the program and execute it in parallel using the different computing devices. The results aim to demonstrate that the proposed techniques can be applied to different accelerators and source codes, and that the chosen metrics improve with respect to those of the original sequential code.

Contents

List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Goals
  1.3 Overview
  1.4 Document structure

2 State of the art
  2.1 Parallel processors
    2.1.1 CPU
    2.1.2 Graphic processing units
    2.1.3 Field-programmable gate array
  2.2 Architecture description languages
  2.3 Parallel frameworks
    2.3.1 Library-based programming frameworks
    2.3.2 Language extensions
  2.4 Code analysis and transformation
  2.5 Software partitioning techniques
  2.6 Summary

3 A proposal for a new architecture description language
  3.1 Programming model
    3.1.1 Execution model
    3.1.2 Memory model
  3.2 A new architecture description language
    3.2.1 Hardware parallel platform
  3.3 Summary

4 Kernel identification and code annotation
  4.1 Software code annotation specification
    4.1.1 Annotation format
    4.1.2 Core attributes
    4.1.3 Data-related attributes
    4.1.4 High-level parallel patterns
    4.1.5 Utility attributes
  4.2 Automatic annotation techniques
    4.2.1 Workflow
    4.2.2 Hotspot detection
    4.2.3 Hotspot selection
    4.2.4 Preliminary kernel annotation
    4.2.5 Kernel selection
    4.2.6 Attribute annotation
  4.3 Summary

5 Static partitioning techniques
  5.1 Task partitioning algorithm
    5.1.1 Partition phase
    5.1.2 Scheduling phase
    5.1.3 Execution phase
  5.2 Summary

6 Evaluation
  6.1 Reference platform
  6.2 Evaluation of the software annotation techniques
    6.2.1 Benchmarks
    6.2.2 Results
  6.3 Evaluation of the static partitioning algorithm
    6.3.1 Benchmarks
    6.3.2 Results

7 Conclusions and future work
  7.1 Contributions
  7.2 Dissemination
  7.3 Future work

Bibliography
Appendices