Information

Article

Unix Domain Sockets Applied in Android Malware Should Not Be Ignored

Xu Jiang *, Dejun Mu and Huixiang Zhang

School of Automation, Northwestern Polytechnical University, Xi’an 710072, China; [email protected] (D.M.); [email protected] (H.Z.)
* Correspondence: [email protected]

Received: 8 February 2018; Accepted: 2 March 2018; Published: 4 March 2018

Abstract: Increasingly, malicious Android apps use various methods to steal private user data without their knowledge. Detecting the leakage of private data is the focus of mobile information security. An initial investigation found that none of the existing security analysis systems can track the flow of information through Unix domain sockets to detect the leakage of private data through such sockets, which can result in zero-day exploits in the information security field. In this paper, we conduct the first systematic study on Unix domain sockets as applied in Android apps. Then, we identify scenarios in which such apps can leak private data through Unix domain sockets, which the existing dynamic taint analysis systems do not catch. Based on these insights, we propose and implement JDroid, a taint analysis system that can track information flows through Unix domain sockets effectively to detect such privacy leaks.

Keywords: Android; information flows; Unix domain sockets; private data; malware

1. Introduction

In the second quarter of 2017, Android dominated the smartphone market, garnering an 86.1% share, according to a report by the International Data Corporation. Meanwhile, there are significant numbers of Android apps that enrich Android features. As mobile devices have become integrated into daily life, mobile devices collect increasing amounts of private data. Unfortunately, the Android operating system has been the target of increasing attacks by third-party apps [1–3], forming a widespread, serious challenge because apps also increasingly attempt to steal private data (e.g., IMEI, and location) and send them to remote servers. This has become especially true in the era of big data [4,5].
Hackers use several different methods to steal private data while remaining undetected by existing analysis systems, including inter-process communication (IPC) [6–10].

Since Android is based on a tailored Linux environment, it inherits a subset of the traditional Linux IPCs that differ from Android IPCs [11–14]. Among Linux IPCs implemented within Android, Unix domain sockets are the only one apps can easily make use of. Although Google encourages Android developers to use Android IPCs, some still use Unix domain sockets, known as local sockets [15]. This practice occurs not only because using Unix domain sockets for IPC is more efficient but also because Android IPCs are unsuitable for communication between the Java language in which most apps are written and native processes/threads [16]. Both the Android software development kit (SDK) and the Android native development kit (NDK) [17] provide APIs for Unix domain sockets. To the best of our knowledge, how malicious apps exploit Unix domain sockets has not yet been systematically studied. After analyzing 2600 apps, comprising 1500 normal apps and 1100 malicious apps, we found that 315 (21%) normal apps and 209 (19%) malicious apps contain Unix domain socket related APIs or system calls in their code. In addition, according to the documentation on the official developer website, APIs for Unix domain sockets can be used in different versions of the Android operating system, including the latest version (i.e., Android 8.0); we verified this by implementing malware that steals private data based on Unix domain sockets [18]. More importantly, the existing taint analysis systems are unable to detect such leaks.

Information 2018, 9, 54; doi:10.3390/info9030054 www.mdpi.com/journal/information

Motivated by these insights, in this study, we conduct a systematic study on information flows through Unix domain sockets and propose and implement JDroid, an efficient dynamic taint analysis system that tracks information flows through Unix domain sockets. Because the tracking process involves taint propagation at the Java level, the native level and between both through JNI, JDroid reuses some existing modules from TaintDroid [19] and NDroid [20].
To make JDroid effective and efficient, we handle several challenging issues, such as the various approaches to the different types of Unix domain sockets, the differences between Android and Linux, etc. An evaluation using sample apps that employ Unix domain sockets to transmit private data and circumvent detection by existing analysis systems demonstrates the effectiveness of JDroid in detecting private data leakage through Unix domain sockets. We further evaluate and report JDroid’s performance.

The rest of this paper is organized as follows. Section 2 introduces the background and related work and describes undetected information leakage scenarios through Unix domain sockets. We detail the design, implementation, and evaluation of JDroid in Sections 3–5. Finally, we report the limitations of JDroid and conclude the paper in Section 6.

2. Background

2.1. Android App Overview

In an Android system, apps are commonly written in the Java language and compiled into Java bytecode, which is then translated to Dalvik bytecode and stored in .dex and .odex files that execute on the proprietary register-based Dalvik virtual machine (DVM) [21]. Apps may contain both Java and native components; these native components are simply shared libraries loaded dynamically at runtime [22]. The interaction between Java components and native components is well-defined by the Java Native Interface (JNI) specification and supported by the NDK [23]. The lowest level of the Android architecture is the customized Linux kernel, which provides the basic architectural model for process scheduling, resource handling, memory management, networking, etc. Consequently, the Linux mechanisms applied in the Android system are worth researching seriously [24].

2.2. Unix Domain Sockets

A Unix domain socket is a data communications endpoint for exchanging data between processes executing on the same host operating system, and such sockets are a standard component of POSIX operating systems [25,26]. The APIs for Unix domain sockets are similar to those of Internet sockets; however, rather than using an underlying network protocol, all communication occurs entirely within the operating system kernel.
Traditionally, there are three types of Unix domain socket address namespaces: FILESYSTEM, RESERVED, and ABSTRACT. An address in the FILESYSTEM namespace is associated with a file on the file system. RESERVED is in essence a sub-namespace of FILESYSTEM, while ABSTRACT is completely independent of the FILESYSTEM. The protocol family in the Unix domain is AF_UNIX/LOCAL [27].

The socket type specifies the communication semantics. SOCK_STREAM type sockets are full-duplex byte streams that must be in a connected state before any data may be sent or received through them [28]. The diagram in Figure 1 shows the complete client/server interaction [29].

In addition, Unix domain sockets support both unordered and unreliable datagram transmissions (i.e., SOCK_DGRAM) and ordered and reliable datagram transmission (i.e., SOCK_SEQPACKET, which is similar to SOCK_STREAM). As the diagram in Figure 2 shows, there is no flow control between the server and the client [30]. Each datagram message carries its destination address, its return address and a certain amount of data. Compared with SOCK_STREAM type sockets, the server need not create a socket for listening via the “listen” method and then call the “accept” method to wait for a connection.

[Figure 1: the server calls Socket(), Bind(), Listen() and Accept(), which blocks until a connection from the client, then Read() and Write(); the client calls Socket(), Connect() (connection establishment), Write() (data, request), Read() (data, reply) and Close().]

Figure 1. Connection-based sockets interaction.
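The connection-based sequence of Figure 1 can be sketched in C within a single process. This is a minimal illustration rather than code from the paper: the helper names `abstract_addr` and `stream_roundtrip` are hypothetical, the abstract-namespace address form (a leading NUL byte in `sun_path`) is Linux-specific, and the sketch assumes Linux behaviour in which `connect()` on a Unix domain stream socket returns once the request is queued in the listen backlog.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: build an ABSTRACT-namespace address. On Linux the
 * first byte of sun_path is '\0' and the name follows it. Returns the
 * address length to pass to bind()/connect(). */
static socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t n = strlen(name);

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    if (n > sizeof(addr->sun_path) - 1)
        n = sizeof(addr->sun_path) - 1;
    memcpy(addr->sun_path + 1, name, n);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/* Hypothetical helper: run the Figure 1 sequence in one process.
 * Returns the number of bytes the server side read into `out`,
 * or -1 on error. */
static ssize_t stream_roundtrip(const char *name, const char *msg,
                                char *out, size_t outlen)
{
    struct sockaddr_un addr;
    socklen_t len = abstract_addr(&addr, name);
    ssize_t got = -1;
    int conn = -1;
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);   /* server: Socket() */
    int cli = socket(AF_UNIX, SOCK_STREAM, 0);   /* client: Socket() */

    if (srv < 0 || cli < 0)
        goto done;
    if (bind(srv, (struct sockaddr *)&addr, len) < 0)      /* Bind()   */
        goto done;
    if (listen(srv, 1) < 0)                                /* Listen() */
        goto done;
    /* Assumed Linux behaviour: the request is queued in the backlog, so
     * connect() returns before accept() is called. */
    if (connect(cli, (struct sockaddr *)&addr, len) < 0)   /* Connect() */
        goto done;
    conn = accept(srv, NULL, NULL);                        /* Accept() */
    if (conn < 0)
        goto done;
    if (write(cli, msg, strlen(msg)) < 0)                  /* request  */
        goto done;
    got = read(conn, out, outlen);                         /* Read()   */
done:
    if (conn >= 0) close(conn);
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    return got;
}
```

Calling `stream_roundtrip("demo", "hello", buf, sizeof(buf))` would move the five bytes from the client end to the server end without any network protocol being involved, which is the property that makes such flows invisible to sinks that only watch the network interface.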
[Figure 2: the server calls Socket(), Bind() and Read(), which blocks until a datagram is received from the client, then Write() (data, reply); the client calls Socket(), Write() (data, request), Read() and Close().]

Figure 2. Datagram-oriented sockets interaction.

2.3. Related Work

Currently, the state-of-the-art method for detecting the leakage of private data is called “dynamic taint analysis” and is typically used in information security to enforce information flow policies to preserve data confidentiality and integrity.
Tracking information flows allows users to know how a program processes private data [31,32]. TaintDroid is a prominent representative of applications that track information flows in Android dynamic taint analysis systems, with some 2312 citations at the time of this writing. Many existing analysis systems such as Droidbox [33] and AppFence [34] have developed new functionality based on TaintDroid.

By modifying the Android application framework and the DVM, TaintDroid stores a 32-bit bitvector with each variable to encode a taint tag, supporting 32 different taint markers. For local method variables and arguments, TaintDroid allocates taint tag storage by doubling the size of the stack frame allocation, and taint tags are stored adjacent to class fields and arrays inside the VM interpreter’s data structures. TaintDroid propagates the taint tags when the app is running, and monitors whether outgoing data has a taint tag in the Java layer. However, TaintDroid only loads native libraries from the firmware: it does not apply to those included in third-party apps.

Currently, hackers increasingly use native code in their malicious apps to hide the program logic [35–37]. Thus, researchers have begun to pay more attention to the security of such third-party native libraries. Some systems use tools such as ptrace [38], strace [39], and ltrace [40] to collect the system-call sequences made by these libraries and use that information to analyse malicious app behavior. CopperDroid [41] collects and analyses system calls acquired by instrumenting QEMU and app behaviors related to Binder. DroidScope [42] tracks information flows by reconstructing both the OS-level and Java-level semantics simultaneously and seamlessly, although it is less efficient.

NDroid tracks information flows across the boundary between the Java and native layers by instrumenting important JNI-related methods, and it monitors native code by processing each ARM/Thumb instruction.
To work seamlessly with TaintDroid, NDroid reuses the modules modified by TaintDroid, and taints added by NDroid follow TaintDroid’s format in the DVM. NDroid leverages shadow registers and memory to save the taints during native-layer execution and sets the taints in the DVM stack so it can refer to them when the taints are propagated to the Java layer.

Unfortunately, to the best of our knowledge, none of the existing dynamic taint-tracking approaches consider information leaks through Unix domain sockets.

2.4. Threat Model and Assumptions

Android apps using APIs for Unix domain sockets need only request Internet permission [43], which is so commonly used that users are not wary about granting it. However, once the app has been granted Internet permission, the APIs for Unix domain sockets can be executed in either Java or native code.

In this section, we analyze the scenarios in which private data can be transmitted through Unix domain sockets and then leaked and explain why such leakage cannot be detected by existing systems. We use TaintDroid and NDroid as references, because they are both advanced dynamic taint analysis systems and open source.

An information flow from a source to a sink is the main requirement for leaking private data. We consider the source to be the APIs that can acquire private data and the sink to be any APIs that can send private data out of the Android system (usually the network interface). Dynamic taint analysis systems track how labeled data impact other data in ways that might leak private data [44]. Private data is first identified at the source, where it is assigned a taint tag indicating its data type. Later, the data will be checked when it gets sent to the sink. Thus, eliminating the taint tags attached by existing systems has attracted attention within the hacker community. To better understand the threat model, we define the client and the server as the sending and receiving ends, respectively, and group the apps that employ Unix domain sockets to transmit private data into three cases, depending on the client and server locations in the Android architecture, as shown in Table 1.

Table 1. The client/server combinations in information flows through Unix domain sockets.
                  Client: Java    Client: Native
Server: Java      Case 1          Case 3
Server: Native    Case 3          Case 2

Case 1. As shown in Figure 3a, in this case, the client and the server are both located in the Java layer. First, private data is transmitted from the client to the server, which is acquired by invoking the source. Next, the server processes it and then leaks it through the sink. It is noteworthy that, in this case, the server and the client are created solely by Android APIs using specified names confined to the Linux abstract namespace. TaintDroid and NDroid cannot detect such leakage, because they do not consider taint propagation through Unix domain sockets in the Java layer.

Case 2. As shown in Figure 3b, in this case, the client and the server are both located in the native layer. The app invokes the source to fetch private data, which is then transmitted to the client in the native layer through JNI. The client processes the data and transmits it to the server through Unix domain sockets. Finally, native code can leak the data directly by calling the POSIX socket API.

TaintDroid cannot detect such leaks because it loads only the app, not the third-party native library. Moreover, it does not consider local sockets. NDroid misses such leaks because it cannot track information flows through Unix domain sockets; consequently, it receives the information from memory without corresponding taint tags. Most existing analysis systems have the common vulnerability that they ignore POSIX socket APIs as the sink.

Case 3. As shown in Figure 3c, in this case, the client and the server are located in different layers: the direction of information flow could be from the Java layer to the native layer, or vice versa. In the Java layer, the client sends private data to the server in the native layer through Unix domain sockets. Then, Java code fetches private data from the native layer through JNI and leaks it through a Java sink, or native code processes the data and then leaks it directly through POSIX socket APIs. TaintDroid and NDroid cannot detect such leakage for the same reasons as in Case 2.
[Figure 3: three panels showing the Java layer and the native layer, each with a Client, a LocalSocket, a Server and Sinks. (a) The Java layer sends private data out; (b) native code sends out private data; (c) Java code sends out private data, or native code sends out private data directly.]

Figure 3. Examples of private data leakage through Unix domain sockets. (a) Case 1; (b) Case 2; (c) Case 3.

3. Design and Implement

Since JDroid must handle the taint propagation in the DVM and across the Java and native layers through JNI, we re-use some modules from TaintDroid and NDroid. Figure 4 illustrates the JDroid architecture. The general goal of JDroid is to track information flows through Unix domain sockets and detect private data leakage.

[Figure 4: JDroid architecture. Labels: Java component, Java libraries, DVM, DVM hook engine, Java-level view, JNI, instruction tracer, taint engine, native component, system lib hook engine, system libraries, OS-level view, Linux kernel, QEMU, OS-level view reconstructor; Unix domain sockets appear at the Java, JNI and native levels.]

Figure 4. JDroid overview.
3.1. Taint Propagation

Developers can use both Java code and native code to implement communications through Unix domain sockets. For JDroid, a challenging issue is how to correctly ensure taint propagation during the process of private data transmission.

To tackle this issue, JDroid creates a structure called Node that records the pathnames of sockets and the corresponding taint tags, and uses a list to store the Node structures. As shown in Figure 5, the Node structure includes taint, the taint tag for the data being sent; srcname, which is the pathname of the sending end; and dstname, the pathname of the receiving end.

    typedef struct Node {
        int taint;
        char srcname[256];
        char dstname[256];
        struct Node *next;
    } Node;

Figure 5. The “Node” structure.
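One way such a list could be maintained is sketched below. The helper names `node_append` and `node_take_taint` are hypothetical and do not come from JDroid; the sketch also assumes that records are appended in creation order and that a lookup consumes the oldest record whose two pathnames match.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Node {
    int taint;             /* taint tag of the data being sent */
    char srcname[256];     /* pathname of the sending end */
    char dstname[256];     /* pathname of the receiving end */
    struct Node *next;
} Node;

/* Hypothetical helper: append a record at the tail so that list order
 * reflects creation time. Returns NULL on allocation failure. */
static Node *node_append(Node **head, int taint,
                         const char *src, const char *dst)
{
    Node *n = calloc(1, sizeof(*n));   /* zeroed, so names stay terminated */

    if (n == NULL)
        return NULL;
    n->taint = taint;
    strncpy(n->srcname, src, sizeof(n->srcname) - 1);
    strncpy(n->dstname, dst, sizeof(n->dstname) - 1);
    while (*head != NULL)
        head = &(*head)->next;
    *head = n;
    return n;
}

/* Hypothetical helper: unlink the oldest record whose names match and
 * return its taint, or 0 (no taint) when nothing matches. */
static int node_take_taint(Node **head, const char *src, const char *dst)
{
    for (Node **pp = head; *pp != NULL; pp = &(*pp)->next) {
        Node *n = *pp;

        if (strcmp(n->srcname, src) == 0 && strcmp(n->dstname, dst) == 0) {
            int taint = n->taint;

            *pp = n->next;
            free(n);
            return taint;
        }
    }
    return 0;
}
```

Keeping the list in creation order means that two records with identical names are still distinguishable by position, so the first one sent is the first one matched.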
3.2. The Handle in the Sending End

Prior to sending, each API related to sending data in both the Java and native libraries is tasked with creating a Node.

Connection-based sockets. In the first case, the client is located in the Java layer. Since TaintDroid uses message-level taint tracking that represents the upper bound of the taint tag assigned to variables contained in the message, JDroid also adopts message-level taint tracking and creates a Node to record the relevant information.

Using the Java API “write” as an example, Figure 6 shows how the functions are called and how to model its taint propagation operation. First, the instrumented code in “libcore.io.Posix.write” obtains the taint and invokes the native method “addTaintFile” to create the Node and insert the entry in the Node list. Note that the ARM/Thumb procedure call standard defines that the first four parameters are passed in R0 to R3, while the remaining parameters are pushed onto the stack, and the return value is placed in R0.
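Because taint tags are bitvectors, the message-level upper bound mentioned above amounts to a bitwise OR over the tags of the variables packed into the message. The sketch below illustrates only that arithmetic; the function name `message_taint` is hypothetical and the tag bit positions are illustrative assumptions, not values taken from the paper.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative tag values; the bit positions are assumptions. */
#define TAG_CLEAR    0x0u
#define TAG_LOCATION 0x1u
#define TAG_IMEI     0x400u

/* Message-level taint: the upper bound (bitwise OR) of the tags of all
 * variables contained in one outgoing message. */
static uint32_t message_taint(const uint32_t *var_tags, size_t count)
{
    uint32_t taint = TAG_CLEAR;

    for (size_t i = 0; i < count; i++)
        taint |= var_tags[i];
    return taint;
}
```

A message that mixes an untainted string with the IMEI and a location fix would therefore carry both markers, and a single 32-bit value of this kind is what the taint field of a Node stores.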
    OutputStream.write(byte[] buffer)                                   (OutputStream)
    IoBridge.write(FileDescriptor fd, byte[] bytes,
                   int byteOffset, int byteCount)                       (IoBridge)

    // Posix
    int write(FileDescriptor fd, ByteBuffer buffer) {
        int taint = buffer.getDirectByteBufferTaint();
        if (taint != Taint.taint_clear) {
            // declared as: native public static void addTaintFile(int fd, int taint);
            addTaintFile(fd, taint);
        }
    }

    // libcore_io_Posix (JNI)
    static void Dalvik_dalvik_system_Taint_addTaintFile(const u4* args, JValue* pResult)
    {
        int fd = (int)args[0];    // args[0] = file descriptor
        u4 taint = args[1];       // args[1] = the taint tag
        struct sockaddr_un src_addr;
        socklen_t src_addr_len = sizeof(src_addr);
        getsockname(fd, (struct sockaddr*)&src_addr, &src_addr_len);    // Kernel
        struct sockaddr_un dst_addr;
        socklen_t dst_addr_len = sizeof(dst_addr);
        getpeername(fd, (struct sockaddr*)&dst_addr, &dst_addr_len);    // Kernel
        insertData(pathTaintlist, taint, src_addr.sun_path, dst_addr.sun_path);
    }

Figure 6. The “write” propagation operation.

In the second case, the client is located in the native layer.
JDroid adopts NDroid’s method to track the propagation of private data through JNI and monitor native code using shadow memory and registers so that the corresponding taint tag always follows the private data. Consequently, JDroid can retrieve a taint tag from memory in the native layer, check whether the data being sent includes a taint tag and create an appropriate entry in the Node list. The rest of the process is identical to that for the Java layer.

Datagram-oriented sockets. Datagram-oriented sockets use datagram communications between one server and several clients. A datagram-oriented socket provides a symmetric data exchange interface without requiring a connection to be established. The sending behavior is implemented by the “sendto” and “sendmsg” methods, as shown in Figure 7.

    ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
                   const struct sockaddr *dest_addr, socklen_t addrlen);
    ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

Figure 7. The “sendto” and “sendmsg” methods.

Although Google provides Android APIs (e.g., LocalServerSocket, LocalSocket) for developers to use with Unix domain sockets, they are not available for datagram-oriented sockets except when the client invokes the method “connect” to establish the connection relation. If using “connect” to connect to the server, the client can use the method “send” to send the data. Following that path, the handling is the same as in connection-based sockets.

Briefly, the SOCK_DGRAM type of Unix domain socket that does not invoke “connect” must be implemented in native code.
Therefore, JDroid hooks the “sendto” and “sendmsg” methods. For “sendto”, JDroid parses the second parameter (i.e., buf) to check whether the data being sent has a taint tag and the fifth parameter (i.e., dest_addr) to obtain the dstname. For “sendmsg”, it uses the msghdr structure to minimize the number of directly supplied arguments, as shown in Figure 8.

    struct msghdr {
        void         *msg_name;       // optional address
        socklen_t     msg_namelen;    // size of address
        struct iovec *msg_iov;        // scatter/gather array
        size_t        msg_iovlen;     // elements in msg_iov
        void         *msg_control;    // ancillary data
        size_t        msg_controllen; // ancillary data buffer len
        int           msg_flags;      // flags on received message
    };

Figure 8. The “msghdr” struct.
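For a hooked “sendmsg” call, the two items of interest sit inside the msghdr fields of Figure 8: the destination address in msg_name and the payload buffers in msg_iov. The following sketch shows how a hook might read them; the helper names `msghdr_payload_len` and `msghdr_dest_path` are hypothetical, and the sketch assumes a filesystem-style, NUL-terminated pathname (abstract names, which begin with a NUL byte, would need length-aware handling).

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>

/* Hypothetical helper: total number of payload bytes referenced by the
 * msg_iov array, i.e. the data a hook would check for taint. */
static size_t msghdr_payload_len(const struct msghdr *msg)
{
    size_t total = 0;

    for (size_t i = 0; i < (size_t)msg->msg_iovlen; i++)
        total += msg->msg_iov[i].iov_len;
    return total;
}

/* Hypothetical helper: destination pathname taken from msg_name, or NULL
 * when no Unix domain address is supplied (for example on a connected
 * socket). Assumes a NUL-terminated filesystem-style name. */
static const char *msghdr_dest_path(const struct msghdr *msg)
{
    const struct sockaddr_un *un = msg->msg_name;

    if (un == NULL ||
        msg->msg_namelen <= offsetof(struct sockaddr_un, sun_path))
        return NULL;
    if (un->sun_family != AF_UNIX)
        return NULL;
    return un->sun_path;
}
```

A hook built along these lines would combine the returned pathname with the taint found on the iovec buffers to fill in one record for the datagram.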
The data being sent is pointed to by the elements of themsg.msg_iov array, and dstname is The data being sent is pointed to by the elements of the msg.msg_iov array, and dstname is pointed to by msg.msg_name. Further, the Srcname can also be obtained by invoking pointedtobymsg.msg_name. Further,theSrcnamecanalsobeobtainedbyinvoking“getsockname”. “getsockname”. Based on these data items, JDroid creates a new Node. Basedonthesedataitems,JDroidcreatesanewNode. 3.3. TheHandle inthe Receiving End 3.3. TheHandleintheReceivingEnd JDroid initializes the taint tag for tracking an information flow entering the server using two JDroid initializes the taint tag for tracking an information flow entering the server using two steps. The first step determines which Node should get the taint tag based on the peer’s pathname steps. ThefirststepdetermineswhichNodeshouldgetthetainttagbasedonthepeer’spathname (i.e., srcname) and its own pathname (i.e., dstname). The second step attaches the taint to the (i.e.,srcname)anditsownpathname(i.e.,dstname). Thesecondstepattachesthetainttothereceived received data, which can used to continue tracking taint propagation until it reaches a sink. data,whichcanusedtocontinuetrackingtaintpropagationuntilitreachesasink. Two approaches are required to obtain the peer’s pathname because of the differences between Twoapproachesarerequiredtoobtainthepeer’spathnamebecauseofthedifferencesbetween connection-based sockets and datagram-oriented sockets. connection-basedsocketsanddatagram-orientedsockets. Connection-based sockets. For communications between connection-based sockets (whether in Connection-based sockets. 
For communications between connection-based sockets (whether in the Java or native layer), JDroid looks up the taint tag (i.e., taint) of the corresponding Node on the server side by invoking “getpeername” to obtain the client’s pathname and “getsockname” to get the server's own pathname. If the client does not bind a specified pathname, the return value of “getpeername” may be NULL, differing from Linux, which returns the pathname allocated by the kernel. Therefore, JDroid identifies the corresponding Node based not only on the pathnames of the client and the server, but also on the position of the Node in the list, which is ordered by creation time. Then, JDroid associates the taint with the received data.

Taking the Java API “read” as an example, JDroid first obtains the taint tag from the Node; the tag is then passed back to the Java layer and attached to the received data. Figure 9 shows how to model the taint propagation operation.
    InputStream.read(byte[] buffer)
    IoBridge.read(FileDescriptor fd, byte[] bytes, int byteOffset, int byteCount)

    int read(FileDescriptor fd, ByteBuffer buffer) {
        int taint = getTaintFile(fd);
        Taint.addTaintByteArray((byte[]) buffer, taint);
    }

    private native int getTaintFile(FileDescriptor fd) {
        int recvfd = fd.getDescriptor();
        struct sockaddr_un sock_addr;
        socklen_t sock_addr_len = sizeof(sock_addr);
        getpeername(recvfd, (struct sockaddr *)&sock_addr, &sock_addr_len); // get the path of the sending end
        int taint = findtag(sock_addr.sun_path); // look for the taint tag of the matching Node
        return taint;
    }

Figure 9. “read” taint operation.
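The Node lookup described for connection-based sockets can be sketched in C as follows. This is a minimal illustration under stated assumptions, not JDroid's implementation: the struct layout, the fixed-size array, the names node_add, node_find and socket_names, and the "consumed" flag are all hypothetical. The fallback for an unbound client (an empty srcname matches the oldest unused Node for the same dstname) is one possible reading of "the sequence of a Node in the list based on its creation time".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

#define NAME_LEN  108   /* assumed upper bound for a stored pathname */
#define MAX_NODES 64

struct node {
    char srcname[NAME_LEN];   /* sender's pathname, empty if unbound */
    char dstname[NAME_LEN];   /* receiver's pathname */
    unsigned int taint;       /* taint tag of the data that was sent */
    int consumed;             /* assumption: one Node per transfer */
};

static struct node nodes[MAX_NODES];   /* kept in creation order */
static size_t node_count;

/* Append a Node on the sending side. Returns 0, or -1 if the list is full. */
int node_add(const char *src, const char *dst, unsigned int taint)
{
    assert(src != NULL && dst != NULL);
    if (node_count == MAX_NODES)
        return -1;
    struct node *n = &nodes[node_count++];
    memset(n, 0, sizeof(*n));
    strncpy(n->srcname, src, NAME_LEN - 1);
    strncpy(n->dstname, dst, NAME_LEN - 1);
    n->taint = taint;
    return 0;
}

/* Find the oldest unused Node for (src, dst); an empty src matches any sender. */
struct node *node_find(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < node_count; i++) {
        struct node *n = &nodes[i];
        if (n->consumed || strcmp(n->dstname, dst) != 0)
            continue;
        if (src[0] != '\0' && strcmp(n->srcname, src) != 0)
            continue;
        n->consumed = 1;
        return n;
    }
    return NULL;
}

/* Copy sun_path into out, bounded by the length the kernel reported. */
static void copy_path(char *out, const struct sockaddr_un *addr, socklen_t len)
{
    size_t hdr = offsetof(struct sockaddr_un, sun_path);
    size_t n = len > hdr ? len - hdr : 0;
    if (n > sizeof(addr->sun_path))
        n = sizeof(addr->sun_path);
    if (n > NAME_LEN - 1)
        n = NAME_LEN - 1;
    memcpy(out, addr->sun_path, n);
    out[n] = '\0';
}

/* Fill src with the peer's pathname and dst with the local pathname.
 * Both buffers must hold NAME_LEN bytes. Returns 0 on success, -1 on error. */
int socket_names(int fd, char *src, char *dst)
{
    struct sockaddr_un addr;
    socklen_t len;

    memset(&addr, 0, sizeof(addr));
    len = sizeof(addr);   /* the length must be initialized before the call */
    if (getpeername(fd, (struct sockaddr *)&addr, &len) != 0)
        return -1;
    copy_path(src, &addr, len);

    memset(&addr, 0, sizeof(addr));
    len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
        return -1;
    copy_path(dst, &addr, len);
    return 0;
}
```

On the receiving side, a hook would call socket_names on the descriptor, pass the two names to node_find, and attach the returned taint to the buffer that was just read. Marking a Node as consumed is a simplification; a tracker that must handle several reads on one connection would need a different lifetime rule.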
Datagram-oriented sockets.
If the client used “connect”, the server can employ “recv” to receive the data, which is the same as when using connection-based sockets. In addition, for datagram-oriented sockets, the server calls “recvfrom” or “recvmsg” to receive data on a socket, as shown in Figure 10.

    ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen);
    ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);

Figure 10. The “recvfrom” and “recvmsg” methods.

Since the server cannot use “getpeername” to get the client’s pathname, JDroid parses the function arguments to obtain the srcname and uses it to look for the Node.
Then, JDroid associates the taint with the received buffer using shadow memory and registers. Only in this way can JDroid continue to track the data.

For “recvfrom”, src_addr represents the srcname, while for “recvmsg”, msg.msg_name specifies the srcname. If src_addr or msg.msg_name is NULL, the socket is treated like a connected socket, and the taint propagation procedure is the same as for a connected socket. Taking “recvfrom” (with a non-NULL src_addr argument) as an example, the handler is shown in Figure 11.

    ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen)
    {
        int taint = findtag(((struct sockaddr_un *)src_addr)->sun_path); // look for the taint tag of the matching Node
        taint_pair newTaint(buf, taint);
        taintMap.insert(newTaint);
    }

Figure 11. Taint operation for the “recvfrom” method.

4. Experiments
In our experiments, we first used a simple tool, Monkeyrunner, to generate random input to test 2300 apps. However, because Monkeyrunner might not trigger an app’s malicious behavior, this test identified just Echo.apk as leaking private data, which it does by following Case 3. On the positive side, our results indicate that malware developers do not yet pay much attention to exploiting Unix domain sockets. Moreover, we used two proof-of-concept (PoC) apps (one each for Cases 1 and 2) to further evaluate JDroid’s ability to track information flows through Unix domain sockets. Finally, we used CaffeineMark to evaluate JDroid’s overhead.

The experiments were performed in a virtual machine with 4 GB of memory running Ubuntu. The host was an Intel(R) Core(TM) i7 running at 2.6 GHz with 16 GB of RAM.

4.1. PoC of Case 1 in Information Leakage

In this PoC, the app uses Java code to accomplish the entire transmission process based on the SOCK_STREAM socket type.
The Java code first obtains the device’s IMEI with the taint tag (i.e., 0x400) and transmits it from the client to the server. When the client begins sending, JDroid creates an entry in the Node list that records the taint tag and the client and server pathnames. Before receiving the data, the server invokes “getpeername” and “getsockname” through JNI and then looks for the corresponding Node, as discussed earlier. When the Node is found, JDroid invokes “Taint.addTaintByteArray” to add the taint to the received data. The main functions in the information flow identified by JDroid are shown in Figure 12. Finally, the received data is sent to the specified server. JDroid not only adds a taint tag to the received data but also tracks the information flow until it reaches the sink.
[Figure 12: In the Java layer, the client's Java code attaches the taint tag 0x400 to the private data and sends it through a LocalSocket to the server's Java code, which attaches 0x400 to the received data. In the native layer, a Node is created on sending and inquired on receiving, with srcname: /dev/socket/sender, dstname: /dev/socket/receiver, taint: 0x400. The figure notes that the data related to the IMEI has leaked.]

Figure 12. PoC for Case 1.

4.2. PoC of Case 2 in Information Leakage

In this PoC, the client and the server are both located in the native layer and based on the SOCK_DGRAM socket type.
Similar to the PoC for Case 1, this PoC first fetches private data related to the ICCID with the taint tag 0x1000, which is then transmitted to the native layer through JNI. JDroid traces the information flow through JNI and locates the memory at 0x4a98c9d4 associated with the taint tag 0x1000. Then, the client transmits the data stored at 0x4a98c9d4 to the server using the native library call “sendto”. By hooking “sendto”, JDroid constructs a Node by parsing the “sendto” parameters (i.e., buf and dest_addr) to populate the taint and dstname, and invokes “getsockname” to obtain the srcname. After the server calls “recvfrom” to receive the data, JDroid retrieves the srcname by parsing the fifth parameter of “recvfrom” (i.e., src_addr), looks for the Node corresponding to the srcname and its own pathname (i.e., dstname), and associates the taint “0x1000” with the received buffer memory at 0x4a98c93c. Finally, that data reaches “sendto”.
Since “sendto” is a sink, JDroid checks its parameters and notices that the ICCID-related data has leaked, as shown in Figure 13.

[Figure 13: In the Java layer, Java code attaches the taint tag (0x1000) to the private data, which passes through JNI to the native layer. There, the client and the server communicate through the POSIX socket API. The taint maps associate 0x4a98c9d4 with 0x1000 and 0x4a98c93c with 0x1000. A Node is created on sending and inquired on receiving, with srcname: /data/data/com.example.test/sender, dstname: /data/data/com.example.test/receiver, taint: 0x1000. The figure notes that the information related to the ICCID has leaked.]

Figure 13. PoC for Case 2.

4.3. Echo

JDroid discovered that Echo may send private data related to the IMEI to the specified server. The client and the server both use the SOCK_STREAM socket type and the FILESYSTEM namespace. Figure 14, which is an example of Case 3, shows the major functions in the information flow identified by JDroid. First, Java code invokes an Android API (i.e., getDeviceId) to obtain the IMEI that will be sent by the client in the Java layer. The parameter is a byte array and its taint tag is “0x400”. Before sending, the instrumented code invokes “getsockname” and “getpeername” to