Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce redundant call of prepareClientToWrite when call addReply* continuously. #670

Open
wants to merge 1 commit into
base: unstable
Choose a base branch
from

Conversation

lipzhu
Copy link
Contributor

@lipzhu lipzhu commented Jun 19, 2024

Description

While exploring hotspots with profiling some benchmark workloads, we noticed the high cycles ratio of prepareClientToWrite, taking about 9% of the CPU of smembers, lrange commands. After deep dive the code logic, we thought we can gain the performance by reducing the redundant call of prepareClientToWrite when call addReply* continuously.

For example: In https://github.com/valkey-io/valkey/blob/unstable/src/networking.c#L1080-L1082, prepareClientToWrite is called three times in a row.

Profiling of Hotspots

image

Test Environment

  • OPERATING SYSTEM: CentOS Stream 9
  • Kernel: 6.2.0
  • PROCESSOR: Intel Xeon Platinum 8380
  • Server and Client in same socket.

Performance Result

QPS can increase by up to ~7%.

Test Scenario

1. Start server

taskset -c 0-3 ~/valkey/src/valkey-server /tmp/valkey.conf

port 9001
bind * -::*
daemonize no
protected-mode no
save ""

2. Prepare test data

redis-cli -h 127.0.0.1 -p 9001 "SADD" "set:1000" "tocawprsxz" "csqcfxyzsy" "ipubipttcb" "jqurtadjro" "zdulfflfqx" "bbfvuialin" "ifpfyncdfe" "kjeevccyof" "stttkrbfqs" "qatkvfuttq" "ltvfnuuwil" "znartcywze" "bzursuzuei" "jlghqxtvok" "osncqcuest" "uxvibjduto" "giubyhvaav" "joijmgposs" "lqxpnplleo" "bnatichltp" "nnfxoqebys" "lyhqvxolfw" "caaqjozcqh" "nlqtadqinl" "hfjxrrsszf" "fussukqrph" "cbjovvgqqy" "wcpbfslakk" "slskrnekbc" "nvonwipkta" "jhdcicttmm" "rpyroriegs" "lnuuootxmp" "ufdxqlonpg" "qgasrnjrld" "jhocasnttw" "smwbxeqbed" "kapxglqccs" "hhwvslfxmf" "rbdabbmnrf" "ltgidziwzm" "qpffifpdiz" "sadqcfniau" "bqoabrqwwj" "wghyakzbit" "bnxtlmiwup" "akzsgaeqon" "zwlhpcahwu" "kfselnpkim" "dxtzxeguoi" "roczxpuisd" "myzjxvtvjh" "alfftdxaxc" "vyiqkfoics" "dygkzcpakt" "ukprrucywq" "afzlyodwiz" "vdaebbupfe" "wemmvswznk" "xzbqjpzqlm" "lqqivzlppd" "rrzcqyzdzf" "ncckxlmsvg" "cpjveufsvk" "babfjxxabw" "btcvhacldb" "mqqrgbacfa" "eqaxrccwjq" "erahoeivfw" "omeatkwwtc" "mjwrbndexu" "gktcjcfxbb" "tfonhwnuxj" "pheajlhymx" "vefgwelnfo" "gayysuldha" "tqpqihwjtl" "eirhwkdgfq" "rnfodijavx" "erqgyscser" "nnnxouavyp" "yhejmjwwni" "mwmcwqzbld" "ofcurtthcs" "idmjjbjvni" "ovqohpxjft" "ocoflktdhp" "kgopxvsdah" "pyjpxqnavq" "nubsytpfao" "ddpgrvwowd" "glynpmsjcf" "whsxmqffqg" "sstqpivwip" "cqfnhujrbj" "gsvkmnluiz" "zdmgjjyukl" "gcfcbjybkx" "qmhyoyctod" "kdodndexvr" "tkgihmsrha" "kmifjielrw" "gefoharnza" "tcpwkimype" "nxllkzroin" "bpvbnmpekh" "ljinllovsw" "mugdxqnjxj" "tqqmmvwact" "uidvmrqyjd" "dthtfrqkce" "efhynoxlul" "iosqxoobrk" "sujbwndgwl" "btxehrokkw" "pmaagvqldo" "evuxmkrrfl" "dclualrzqb" "jfqxkxgqhj" "fvemodlpgz" "lawrpikwsk" "socoxaegfa" "snomfrutha" "yxsnreuepl" "vjihaakiof" "nnhrgirrtw" "jccorylnjg" "gehuriygwq" "icqjxcermo" "ocgjeuljxf" "qslrwqmixc" "rhzpguhsws" "zxlbhyeckf" "iziwqojsoq" "qlorevyltp" "gbjzsedhag" "mkxysrkpug" "bhrvnadcdk" "qxxinxaqxn" "ctnaggdbru" "fsthobmdxk" "cvnnitrrow" "vlhtdpqavh" "vhjaphfdpj" "yhdgqenmwv" "ysntbzffxq" "emfjcnujqn" "qnqzibcmip" "ngcxqjjpdm" "gkneclxnnt" "rhkpfsuhoq" "dgodkfjzos" "isqymcuffe" "ripecixnpr" "dxpepbctea" "gbeizdzdyb" "aqlapmghln" "yhlalzusch" "xglrugpjkt" "ngwifjdpha" "jvekvvldai" "hmdzsuuyrn" "ncabqesziv" "whdftyqojv" "rhzqdtxucc" "ftyxhyfokj" "vqtixjkcbb" "krfosgrmjb" "ahcaaodvgi" "ooeswhfdnj" "jhctncrzlw" "haxesjafmh" "vxrtzngznb" "fidsuuizcf" "mamtueyrqn" "quemrlmwod" "pkgpjwyfbh" "ckibsdtfff" "tjnjhejnju" "puvgjfjyaf" "cvmicoarvv" "mxpzuzrzuo" "rrrfhnclbv" "xeurpmfdmo" "yqvkykgjbe" "behdxlfdho" "dyzedskzkq" "rfhlttsuqy" "pkehotsmka" "alokvrpbih" "mobwpcyxuk" "umwunfzsvo" "naggqdxcjm" "rakustfykw" "dtkfydidli" "kohpozxkhl" "usjmfkopln" "axhoxkubdv" "asretszbav" "tmkoxwdgpx" "wjhaavxfge" "pcuaesomdc" "vjhpmffzxc" "qwxzqlqter" "jjumoixniz" "ruxsmttpak" "pjdundsxrd" "kdklhpxntt" "muhewfzihs" "dplonqlliz" "wjibkklezg" "dfemamyevk" "nryvfijxhj" "bqqohkuylc" "wiqhmhkiel" "lftmqoxhfc" "sjbaedopjb" "dlomhvkoxg" "jhkdwtqvwl" "vqashxkrik" "mupcilqfjg" "suahxaebee" "rqkcyxiwhz" "jqgtbgbybq" "ygbfgfefac" "kjblkrvknt" "yajpmxmuwz" "wwowdvybjj" "bdtbaxnuko" "adfhfatarh" "vfcpevtekf" "fiugzrozky" "spogjykkfs" "tdggmsxysk" "aoqlctikzg" "nwywtydqew" "qjrhtqgwjc" "dhzgpwewsx" "outdlyeqvq" "trwzipsers" "qtpcwuafar" "scgjdkyetq" "aqyfvxyjqr" "xkvgnzjgrm" "hhbceuegvh" "paitaeqrpb" "yfdsmhtria" "bxwvqvndcc" "dpyjoihqrs" "tnratexlre" "hplvvuoscb" "muocqqypmt" "pxzkuasjek" "flrsaczxzc" "pubqtzzzko" "vpqlxtfkjz" "fiafoggekm" "qtewhixedb" "iijjcabgak" "tqjpijliii" "uttazeawix" "hxbmykkugi" "bekchebgys" "ffrviosqzo" "rjrptuhkav" "sldzewoxas" "uesalivsis" "maxylirjgh" "vpzsmbjkvy" "eiziligjfr" "tqblforkpa" "nszbrpweoz" "rzanpefsfy" "cejkfhuykf" "abinkgshoi" "gqybtjuhvq" "oqdlpaubsc" "nrbfkysxaf" "mhxojehvxx" "vuqlqdpfdn" "orqqevpmca" "xigznrdgqy" "jzceexkqam" "szupcnvvij" "btgeubdzbb" "nojnedgabk" "sdnkjddyut" "lbjarnpxhh" "wevfinjbqk" "dvgqwzignk" "ejzwnidqwr" "nlxwjmzwln" "brrlblrxwa" "hyikggurti" "wybmlpqblt" "hertbwuzyw" "rwhzzytdsq" "symbgeyple" "zbfeyptemz" "pghbwbtfmk" "mxydilgynv" "bhwytqsafu" "ecsburyjhh" "cvohdragtx" "lscjhgztom" "giswndixdf" "etsngvbrff" "lgqazzajpx" "pypepewjvq" "nswjopvtqv" "tuajnnqtcq" "bvvoibkfrt" "kjqeujfkoh" "diwmfuckel" "bwizktcwmb" "ughnpilqqm" "ihealvwnxb" "thqttakyzy" "auwfujaoya" "rofnkytnhm" "ilkuddrdvh" "hmwfncgzxg" "pzrchtwaaw" "ffksbrtbfq" "ethxaycsil" "uwiqrvcqvu" "fgcehqgsso" "yoblelzlkd" "gjiwldcfqh" "sbrjnwxdip" "nenhiiibwx" "ebhhhgabjd" "xpkwqbfban" "pupmdjgyed" "aejnvyfdst" "krxneqolle" "nouncgkoik" "kamgfgbxel" "fffylsswky" "agswwrfabr" "pkvcbelpos" "mxapzqqqsw" "ywmqoaztmy" "sfuvzzxbxq" "kdcvbkrbsj" "twpiiaedpc" "egmgddriry" "nmfihtnkel" "kqzjnkdlxd" "eovsizpcjp" "bsavjyaksg" "xlmvatfsly" "dlhjfafskj" "wmvhvwnowp" "vjjozwrovk" "gbazuqnmit" "ubwlcefgqb" "jttqzbazgz" "dozecfsvue" "pgdhjrxhga" "gzekysdunp" "ygoiannoht" "hklchdenoe" "sotbjzlsvz" "qjwrnhooax" "cdghgcsoth" "mjlpvuoghe" "qclkaeciey" "oownjpxrov" "nvqfyljbef" "tsnawydcru" "wrrgxxkxkc" "ylulwsnjay" "lxsinouutc" "ozpyyaznsh" "cmhkstsjok" "ybckvbeoib" "fsoardckcw" "ltkauvxggz" "sqwhsgboef" "wgtjxahmef" "spoqshzjoi" "pfvfxrrfhl" "nahweurftw" "fojjpqmbck" "zexblqeora" "qsoiwsugdv" "ksppwhhqzj" "otadcihtmd" "imnjbkmsls" "zzenkvuesw" "kbfqdppnfa" "igehetokzq" "koujdppfua" "wqsqzzbqhm" "tglieutcis" "owovlhorvw" "nraylduhut" "nwnyjkugcf" "kpfqxroqbs" "xwxwosqkhm" "ollacusjzj" "wcouaiatsu" "nvkfnfzoki" "fgjnsosfrp" "pltsnzqvpi" "rhnkzlsjtk" "ysnndkycix" "bpnfopinub" "blujwnyluy" "wgtmckqknh" "zorzyqtjtr" "hvtlkrungk" "rgtondctpo" "mjgvtydjtm" "kcbotffyca" "gybxnvwchp" "gazojexans" "hmcpcrjumm" "zejhycldyy" "iiissmznfe" "qvpuudyuks" "gviypfayfm" "plqbwsiuzw" "nunchscyqc" "qocjpufxio" "iqbyikqjmx" "omwbgglqsp" "nywteueaig" "ntmgbzaivy" "ijdgnlzprg" "rnlaakgsrf" "fpdflprzvn" "azkdbpnshy" "mvfnirshbd" "sotsxznskx" "uzktwqcdeb" "myrrmvflyw" "jgaieawkcu" "utymwhxigo" "vtaiyncmyg" "gpodilvrnm" "xgfzndhodu" "saqilljaid" "jxiewthqls" "nbwksmwxpx" "rwfykeeqgx" "tlnkrncpwi" "ogyvxbgcwi" "ffcqkkzllx" "rtnhivnxtb" "vzcclamtun" "jjlefkekuw" "xjksnqifds" "ctusqixohm" "osaekeukqx" "irlduoinie" "nifzrybfuh" "ctqxoyxbwc" "vsvhjrymqc" "bzwxqcpftf" "ltghdkluqq" "vklwhyzqhk" "ghwcrdlbjj" "lzzptujbjp" "qlvgfplbod" "ghepftfjgk" "aiqqyusnuv" "rspghuhpbp" "lfkqrtxocm" "iibgagtkpg" "ywiurvfbpg" "tdceweesxh" "pvwvdaorrl" "ejlunxlwxn" "ymqxhmnidz" "lydebbpmfb" "ztjuqomjck" "eyrbqexkff" "oqmuhlruqy" "gnrmnwaxls" "mumhqarhgg" "skbzfbeziu" "hnnfmyurhx" "yrsizkbbwz" "azpwrzovza" "txhllnvudv" "aslibwggrp" "ubghghklvj" "jqqogagqni" "emfqsjraia" "ctgwmawlgl" "mivctgaajt" "knycrcrsbm" "ubtiscdgrn" "ulepgommyy" "qbhdjhoohc" "cctlfgicpv" "phfuspevwk" "oeawjlqnyg" "jpphbjtbrh" "ofykgotycd" "csjfbpjyzq" "thmmmlqluk" "buzhjxsbkm" "pisgqibyae" "skkawcmqqt" "mmqblvrscy" "dpkiubfzbx" "yivxcecwlp" "kbnjiilaqd" "rwrxxrnwtq" "veegnotgmj" "pbfijwccjp" "expefhkisx" "ynnhyctikq" "bhfmhanvxe" "otclvmbilg" "hskkmrluuf" "ftnbjymlll" "nbkaxrojqq" "qydrgilxxt" "dxufcyurjx" "fgygwdazbm" "tivnqailcl" "jwvqixjhho" "oglqutqfcx" "wvrlxfoxff" "ropuqidkxv" "qcsxjrjcfc" "twuvkpjzzw" "fqtktfghcv" "suhwnartid" "wvsnfinuil" "rngtndwjyg" "tsmzfswaxo" "uvlswctlhx" "llamjvxyqo" "wovoupawzt" "caxgjftjyj" "gwzqcetcji" "yzrdbalexf" "fnpdsuozxt" "dbtbtvkqss" "pwgjoppmgc" "wxjdgbugeu" "qchpfcigwa" "lxzdcbveuy" "bwjyghaztz" "uedehyieof" "pfaytznuaa" "lspvrnxnjo" "zkbqvttlzy" "fkdmuxraqf" "nbizrabfuo" "fgzwwaedjy" "gkmwutvars" "bwsdzrxzse" "txgjxzovte" "cbtpbbfrdd" "vqgztpmzhz" "rdipvyeqoi" "bovkdabcdo" "fhobhpwwkp" "mkbkflixkr" "mjifqzmtsd" "pkcqdokojd" "dtgjnddwch" "uboipezuni" "dfdodbelzn" "fzsoiryhfn" "krtsiucvvu" "aieekmivcb" "aeafusfzdn" "ehnrizfmfo" "dcjlwhfstw" "wksgvbkbyw" "hvfprkjlbc" "jlgepeyhpc" "ljklggibcy" "mhrvuemywb" "wdqygrxkya" "ystnkbogee" "flvftlpbjq" "vgfgbsbnwy" "rsivptwulz" "bzjzucrypq" "bweysooxiv" "mmcunsiwad" "mszjkgsrio" "bvurseeqmh" "wtcpliaxmk" "ndwiompimr" "mdcwoblmkl" "dflxukffgl" "mcojdazpfq" "tctgzmjads" "dewdgfrhos" "iwqanwtvcd" "nfucelqjfe" "wgtrwefdsw" "skstqdgbos" "rwllkdzxrj" "qwozutlufu" "fmpdixcckx" "jybzltmwrs" "ossjrvqmaa" "adlxahxsbq" "mbewprqunw" "xbvbujurqw" "rnvhfxbuoi" "pyrpwxalpc" "adlryhdbpr" "gritvkzfgw" "aufhfrhccf" "umoicweaab" "kgirldeylz" "nknlysgviv" "plbxaamppj" "ikpikupjoi" "eioxaswdee" "imexfccbxk" "ouroipthpq" "jbzyfznpdn" "asidljmwgb" "jeazfmhrcb" "dablvesuho" "zuoqjiciij" "qmxxfyuodo" "vkqalcokst" "jhibapuhga" "cmqraybrlw" "beqsnrixhl" "rmqxtqcxua" "ndltyojjxj" "hyanpicfan" "yzutuazhmh" "tumnalubch" "eksvvoxziw" "weqhfkosif" "wwfbpjatrp" "lrhrkuyzry" "uvbtcgtopw" "fmyleefltp" "kkrxiaiife" "gbkqhfumyu" "tdmjyuitvv" "jvtalmlkng" "rdsfcdvkqz" "xqvjnlpssl" "fuftndsnim" "keklddczkd" "wrqnytptzm" "rwzijctxzs" "btakuczlec" "fuipidfbjt" "kjiqagynco" "ahjawbsqcw" "iehxaaneev" "ezbiwqnabg" "pnnzqcutoq" "wlogkzxkpo" "xzswnnldvs" "qqfnugftmr" "zuccleayil" "ckqebhazel" "brwlqbfoat" "anmcogawkg" "roqzbzpbbt" "dxnprfawun" "fffreqppjj" "gfdzgxfdcg" "sshbuxfljd" "shckmujxzo" "rqurawzebz" "vpehhmoxva" "vldwfdnicm" "tzhjrlfvfp" "ymwwctfodg" "qsxfnsicrx" "gfhrrjczsp" "gtqrsktbaa" "dniplpxfof" "htawohddyn" "dbcxnsiacw" "dhfundvlpn" "uewpgskfpu" "cuuytorpnp" "vlcnbfqvox" "jbqibabrmv" "xhspgwheck" "fsuovvpgng" "gcjruttnno" "wxswusqpeo" "qhhhipzncq" "mcbuftndrr" "owjfgjqqjc" "vvmkjgajwa" "wvlvshnhmx" "ekponflaeq" "kuiumwomxi" "aoydkdfrpe" "cglxptkcsz" "uqbpcvkipa" "ubzgvzputq" "wmyphdckda" "ukdnaklmcp" "ramoirrdyd" "vwayaqmtid" "ltomuspfzc" "wzxdkpehwf" "yzcspfvcot" "cgpvvnbvlk" "farwqgfyjf" "lbxvlwzony" "ocesqguvym" "yzviqaobku" "cnngbbpowp" "ucxeoqcssr" "zcffhzusrl" "yzmodbpsnb" "aryiyaltqw" "xkaailrpns" "lpahctqgna" "cnbqnvxmjp" "nugjvhftma" "xsgcuvxzor" "xwtwtwmbgu" "emdwpvauyc" "ahfktrqmgh" "jznackjcrd" "etcsjxoqab" "kpzmuwqbnt" "dspznsgszk" "rcwbzvwbva" "mlznoaajqq" "iwuuxdactm" "zujobawsct" "snepgcispg" "cgmivhyskk" "snunzlgfkd" "ppdxnadmje" "wtzqqecgfy" "ncremxgfdb" "cblsafugqk" "hjekcxfyds" "faxedqgskm" "jjczogqdwz" "jfbgmhtjke" "nehqnkqnld" "lcdchjadll" "llimzyabsp" "iwapedwyle" "iobkwbwceu" "twmbtaxdro" "nmtmjmhmdl" "ewoqykjbkc" "tmyuncyoyd" "dcepfcdddn" "dnvwyhyhsn" "nrencopzqn" "yjyffpgoop" "uvqtefqdhk" "yjhypaonqq" "uqvzpcvugl" "cakvxrdpmj" "tvzacklhdz" "higdkhodzy" "ormdblyhhn" "wbouqpojzl" "eyhgspybnr" "lywsezpzgf" "usykkwszvh" "bcwncpnibg" "jgcqryhsvk" "yfvwesgulw" "geizujxrkg" "zknlteeaxq" "nqwjivcosg" "qmnxipsiga" "pthacnunjj" "afamsavgsi" "bzfzxzecrs" "sxcihybfci" "padscbypdo" "gaotvjctjh" "beicnwdryg" "xsueeljljp" "mkrrypcfzy" "ekjgqnjxyl" "iyeiercbxr" "rkwlgzhvvy" "hmnaoeeasz" "aquymkrswt" "ulnnuwyptq" "xftfzsoiwc" "urkkyscfti" "wabroeeoop" "qpzkuxsipr" "dxdngrmypg" "icatrrbcjs" "fhuptkhkzm" "apyzwvajot" "vealtjlqyc" "khkkfmzkow" "trzqdcaqdw" "itmekixthv" "pudgkcbwdx" "zuibhuihtz" "kzuywkxlku" "ogtqmpnzie" "jetamrlglx" "fjdjumschq" "kprzbyngsw" "xeyxlxiqch" "dtuhvpszzt" "fpmbbgiaao" "hjlhurakwh" "mshexjmkmn" "cynhehkcxs" "cvbbbdzmie" "cvnlzjdfgf" "ifhkjgmxrd" "audguegpmo" "jzstgleeby" "eafrzhdhhq" "fmmammvdyj" "uncqdpbhwb" "fzatoyblsr" "xtwlklqdna" "ydqppngxvh" "mkngszsxeu" "vyewicgjio" "tstbluhyhj" "qzxxwlfeki" "ocmtsfpsgh" "xmknbbmdbf" "pdjmftsmob" "ygrpkpstxq" "hrhiqcarju" "aadzbodres" "curhymvwsx" "tbqidtevrl" "avchkjnlwm" "tyephutkmb" "lxoaezrdxs" "ctkwlhmgfz" "xkiuuciwrn" "irrovfyshb" "hwuofuftlr" "mhbfsuaovv" "wzuhzzdezi" "jlpobgvouj" "qbpmtomqpu" "shlwywnxpk" "srkvjhetmj" "hvxefqtmqu" "fazsvkljef" "bstezdkmig" "asbtvfzien" "vewfxcxkpf" "tqkprkoixe" "rcaatkjyur" "euleuicawb" "ifiizdeong" "cjcrpmggtu" "kxggjpatkd" "klwqsggtob" "mnsaklzgob" "xfxlervrgn" "eraxdyjftw" "xrvonyieqa" "fswhywqxhy" "iqzxblqkeo" "rxvhmzvbcv" "wvdmobfisx" "ujybghjfnf" "yufagalzhk" "qxbqbfcgjp" "vorgqhmaoq" "zewylkylsy" "vvmaucizkv" "bgcoyoduda" "vnsufnurol" "rtskokvklv" "svvdufedug" "qgdgujdvtg" "rjrtvpntke" "shgetgsird" "ywgeotcect" "zsikdzycyt" "gcsswbksnc" "qgobfhgspy" "pbxrbaxnor" "viwarrumob" "eaetplspga" "jqmscuprwq" "nkyuframnm" "gygftrsdbm" "qzlfnntjar" "fzzcioobeb" "ydigxptqbl" "bgtxhxkhvv" "hggqmlgwha" "ywlqbjqeug" "qwowxqzrkz" "zybosgbtxt" "cflarkquuv" "klaeknlbrm" "ccnbldglgl" "dpauqcpgyi" "ylxiwiesps" "xyxmlrdbui" "arqfxfqkzh" "byrkeibrfb" "laepwenqmc" "kluswgtjsf" "mgldvzleyy" "yqmzmmzwpd" "tvlckdoyfe" "dmxcbvzrxg" "qquwyuyvvw" "pmytvtksfi" "umttshfkpk" "rmdayyptch" "glwrmjpotx" "bgcnzgcmza" "ivinvxopgz" "dmbarohbfj" "rncdgqxqfq" "zmmwzkjrjl" "gdlztbhpeq" "zrwgpknaop" "powzkcrtvv" "cszvzbrmoy" "dtjljhzqcm" "anznywecwk" "amuwlfaxwv" "ajdkqflpen" "evjrybtwww" "oxsdmrdbit" "yafipxfsip" "xekxarmwcq" "dgcesswkvc" "gdqgmwxkmt" "spdyueanru" "yrvmdhnnfc" "aexxjlgwuo" "xpcpytommm" "gjutzwoxlf" "stnfirydgi" "snpuvnebpy" "rfxibyjmpg" "ortxlvmdoc" "gdozstnglr" "eqiukbyscu" "qzcrpbvatq" "dwzqowbrsd" "iesbitdnjd" "inboyxgoqa" "lfojnetxdc" "njmufqrykx" "ybcdthmgws" "igwekdegcw" "ajkgxmtamu" "qkyfpamste" "nwybjbhgep" "arqqmfmmbz" "rqiyxwpuyv" "nsdvirehqh" "qckueiqiwh" "tjnbsybxws" "jphvxuqipp" "ghtoyhrfxh" "erglflfnql" "kngwkkzwts" "nmguhcygct" "jigyicdeft" "gamcdtywne" "nunpqugdit" "ghqwxaqlef" "nqxdrqigvf" "xepfvvcovk" "ezgxjiwwig" "izizuzzjuv" "mallnshtok" "tctrsxgnrc" "exhjfssojj" "yilvzcevlj" "nepxmyiuhr" "dqqfcdugde" "iamjlqlznh" "mvmsikqfxu" "kmqlwfbsex" "pribqncfuf" "zavrjnezrf" "kmcwshsbye" "uzaejrbwue" "olezxlliej" "hjjxyybxiv"

3. Run benchmark

numactl --cpunodebind 1 memtier_benchmark -s 127.0.0.1 -p 9001 --command="SMEMBERS set:1000"  --hide-histogram --test-time 180

References test-suites

memtier_benchmark-1key-set-1K-elements-smembers.yml
memtier_benchmark-1key-set-100-elements-smembers.yml
memtier_benchmark-1key-list-1K-elements-lrange-all-elements.yml
memtier_benchmark-1key-list-100-elements-lrange-all-elements.yml
memtier_benchmark-1key-list-10-elements-lrange-all-elements.yml

@lipzhu lipzhu force-pushed the redundant_prepareClientToWrite branch from 3ec1437 to 4549e46 Compare June 19, 2024 06:41
Copy link

codecov bot commented Jun 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.04%. Comparing base (5a51bf5) to head (b7b83bd).
Report is 11 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #670      +/-   ##
============================================
- Coverage     70.17%   70.04%   -0.14%     
============================================
  Files           110      110              
  Lines         60077    60089      +12     
============================================
- Hits          42160    42087      -73     
- Misses        17917    18002      +85     
Files Coverage Δ
src/networking.c 85.27% <100.00%> (-0.11%) ⬇️
src/server.h 100.00% <ø> (ø)

... and 13 files with indirect coverage changes

@lipzhu
Copy link
Contributor Author

lipzhu commented Jun 21, 2024

Ping @valkey-io/core-team, could you help to take a look, is this optimization sensible?

@madolson
Copy link
Member

madolson commented Jun 21, 2024

Having a couple of weird optimizations that are dangerous to get wrong (we won't install the correct handler) feels weird. Do we know why that specific function is taking 7% of the CPU? Is there some cache thrashing going on maybe we can fix more generically? I read through the code in prepareClientToWrite and it seemed fairly straight forward.

@PingXie
Copy link
Member

PingXie commented Jun 21, 2024

QPS can increase by up to ~7%.

Is this just an estimate or measured using the same tests? Can you share the row test numbers before and after the change?

Do we know why that specific function is taking 7% of the CPU?

+1. prepareClientToWrite doesn't involve any system calls. Do you mind sharing cache misses metrics (for both data and instruction) for the tests?

src/module.c Outdated
if (prepareClientToWrite(c) != C_OK) return VALKEYMODULE_OK;
addReplyProto(c, "+", 1, 1);
addReplyProto(c, msg, strlen(msg), 1);
addReplyProto(c, "\r\n", 2, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like delegating this internal detail to the caller.

Can we make a variadic version of addReplyProto so these three calls can be replaced with just one that does prepare client just once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will update it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zuiderkwast To keep this patch small and clean, I removed non related logic, only did optimization for addReplyBulkCBuffer which is in the critical path.

  1. Remove addReplyLongLongWithPrefix from header file, it is only used in networking.c.
  2. Use _addReplyToBufferOrList to replace addReplyProto in addReplyLongLongWithPrefix, and move the prepare client logic to parent.

@lipzhu
Copy link
Contributor Author

lipzhu commented Jun 21, 2024

@madolson @PingXie @zuiderkwast Thanks for your review on this.

Having a couple of weird optimizations that are dangerous to get wrong (we won't install the correct handler) feels weird

@madolson I did some check, the write handler should only be installed in the first call of prepareClientToWrite. Am I missing anything?

Do we know why that specific function is taking 7% of the CPU?

Maybe caused by L1 cache miss(see below profiling result).

QPS can increase by up to ~7%.

Is this just an estimate or measured using the same tests? Can you share the row test numbers before and after the change?

@PingXie The 7% improvement is based on this test-suite https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-1key-set-1K-elements-smembers.yml.

Do you mind sharing cache misses metrics (for both data and instruction) for the tests?

I did some profiling for cache missing
perf record -p 'pidof valkey-server' -e mem_load_retired.l1_miss,mem_load_retired.l2_miss,mem_load_retired.l3_miss sleep 10

We can find the the getClientType (inlined) and its associated symbols contribute more than 17% L1 cache miss, but didn't observe the prepareClientToWrite.

image

image

@madolson
Copy link
Member

I did some check, the write handler should only be installed in the first call of prepareClientToWrite. Am I missing anything?

Your code looks correct. However there are other cases, such as hitting the CoB limit, that will cause your implementation to behavior slightly differently.

@madolson
Copy link
Member

We can find the the getClientType (inlined) and its associated symbols contribute more than 17% L1 cache miss, but didn't observe the prepareClientToWrite.

This is interesting. I don't have a dedicated bare metal server to test on, but we are just calling this function a lot.

@madolson
Copy link
Member

madolson commented Jun 21, 2024

Some other observations while looking at the code more:

  1. putClientInPendingWriteQueue doesn't need to be called every time. It could just be called at the end of every command processed. We just need to set a flag that we need to do it.
  2. We do a bunch of tests that are basically immutable for the entire duration of the command. CLIENT_MASTER, CLIENTER_REPLY_OFF, CLIENT_SCRIPT, CLIENT_MODULE never change. Maybe we refactor this into a single flag we check. It should give solid branch prediction while in the inner loop of the command.
  3. We check c->conn, which is in a different cache line. This is also an immutable property and we should only need to check this once. Maybe we just add a flag for this?

I wonder if we try to optimize the codepath first?

@lipzhu lipzhu force-pushed the redundant_prepareClientToWrite branch 2 times, most recently from 9b5d3a8 to b7b83bd Compare June 22, 2024 03:03
…tinuously.

Signed-off-by: Lipeng Zhu <[email protected]>
Co-authored-by: Wangyang Guo <[email protected]>
@lipzhu
Copy link
Contributor Author

lipzhu commented Jun 22, 2024

@madolson Thanks for your suggestions.

Some other observations while looking at the code more:

  1. putClientInPendingWriteQueue doesn't need to be called every time. It could just be called at the end of every command processed. We just need to set a flag that we need to do it.

You mean we cloud move below logic outside prepareClientToWrite?

if (!clientHasPendingReplies(c) && io_threads_op == IO_THREADS_OP_IDLE) putClientInPendingWriteQueue(c);
  1. We do a bunch of tests that are basically immutable for the entire duration of the command. CLIENT_MASTER, CLIENTER_REPLY_OFF, CLIENT_SCRIPT, CLIENT_MODULE never change. Maybe we refactor this into a single flag we check. It should give solid branch prediction while in the inner loop of the command.
  1. We check c->conn, which is in a different cache line. This is also an immutable property and we should only need to check this once. Maybe we just add a flag for this?

Got it.

I wonder if we try to optimize the codepath first?

Do you want to combine them together with this patch or separate?

@PingXie
Copy link
Member

PingXie commented Jun 22, 2024

We can find the the getClientType (inlined) and its associated symbols contribute more than 17% L1 cache miss, but didn't observe the prepareClientToWrite.

I assume this is the data cache misses?

Some other observations while looking at the code more:

  1. putClientInPendingWriteQueue doesn't need to be called every time. It could just be called at the end of every command processed. We just need to set a flag that we need to do it.
  2. We do a bunch of tests that are basically immutable for the entire duration of the command. CLIENT_MASTER, CLIENTER_REPLY_OFF, CLIENT_SCRIPT, CLIENT_MODULE never change. Maybe we refactor this into a single flag we check. It should give solid branch prediction while in the inner loop of the command.
  3. We check c->conn, which is in a different cache line. This is also an immutable property and we should only need to check this once. Maybe we just add a flag for this?

I wonder if we try to optimize the codepath first?

+1 on 1) and 3) but not sure about 2). flags should be hot in L1 d-cache so is the idea here about reducing the instruction count (via a combined flag), @madolson?

Copy link
Member

@PingXie PingXie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @lipzhu!

Can you confirm that you get the expected performance boost with these changes? and how much is that?

@lipzhu
Copy link
Contributor Author

lipzhu commented Jun 23, 2024

We can find the the getClientType (inlined) and its associated symbols contribute more than 17% L1 cache miss, but didn't observe the prepareClientToWrite.

I assume this is the data cache misses?

Yes.

LGTM. Thanks @lipzhu!

Can you confirm that you get the expected performance boost with these changes? and how much is that?

@PingXie I test 4 test suites which may benefit from this patch, they are all from https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites repo.

Test Name Perf Boost
memtier_benchmark-1key-list-100-elements-lrange-all-elements.yml 3.5%
memtier_benchmark-1key-list-1K-elements-lrange-all-elements.yml 13.1%
memtier_benchmark-1key-set-100-elements-smembers.yml 4.9%
memtier_benchmark-1key-list-1K-elements-lrange-all-elements.yml 10.0%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants