Don't Let Storage Become the Main Bottleneck in Model Training

Technology companies are said to be either scrambling for GPUs or on their way to acquiring them. In April, Tesla CEO Elon Musk purchased 10,000 GPUs and said the company would continue to buy large quantities of GPUs from NVIDIA. On the enterprise side, IT staff are also pushing hard to keep GPUs constantly busy to maximize return on investment. However, some companies may find that as the number of GPUs increases, GPU idle time becomes a more serious problem.

If history has taught us anything about high-performance computing (HPC), it is that storage and networking should not be sacrificed for the sake of focusing too heavily on compute. If storage cannot deliver data efficiently to the compute units, then even with the most GPUs in the world you will not achieve optimal efficiency.

According to Mike Matchett, an analyst at Small World Big Data, smaller models can be run in memory (RAM), which allows more focus on computation. However, larger models such as ChatGPT, with billions of nodes, cannot be held in memory because of the high cost.

“You can't fit billions of nodes in memory, so storage becomes even more important,” says Matchett. Unfortunately, data storage is often overlooked during the planning process.

In general, regardless of the use case, there are four common points in the model training process:

1. Model Training
2. Inference Application
3. Data Storage
4. Accelerated Computing

When building and deploying models, most requirements prioritize a rapid proof of concept (POC) or a test environment to get model training started, with data storage needs given little thought.

However, the challenge is that training or inference deployment can last for months or even years. Many companies rapidly scale up their model sizes during this time, and the infrastructure must expand to accommodate the growing models and datasets.

Research from Google on millions of ML training workloads reveals that an average of 30% of training time is spent on the input data pipeline. While past research has focused on optimizing GPUs to speed up training, many challenges remain in optimizing the various parts of the data pipeline. Once you have significant computational power, the real bottleneck becomes how quickly you can feed data into the computations to get results.
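
To make the input-pipeline share concrete, here is a minimal sketch of how it could be measured on your own training loop. It is an illustration only, not the method used in the Google study: `measure_input_share`, `slow_loader`, and the `train_step` callable are hypothetical names, and the simulated delays stand in for real storage and compute.

```python
import time


def measure_input_share(batches, train_step):
    """Return the fraction of wall-clock time spent waiting for input data.

    `batches` is any iterable that yields training batches and `train_step`
    is a callable that consumes one batch. Both are hypothetical stand-ins.
    """
    wait_s = 0.0     # time blocked on the data pipeline
    compute_s = 0.0  # time spent inside the training step
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # blocks while storage/preprocessing catches up
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)
        t2 = time.perf_counter()
        wait_s += t1 - t0
        compute_s += t2 - t1
    total = wait_s + compute_s
    return wait_s / total if total else 0.0


def slow_loader(n, delay_s):
    """Simulated loader: each batch takes `delay_s` seconds to arrive."""
    for i in range(n):
        time.sleep(delay_s)
        yield i


if __name__ == "__main__":
    # Assumed numbers: 3 ms to fetch a batch, 7 ms to train on it.
    share = measure_input_share(slow_loader(20, 0.003),
                                lambda batch: time.sleep(0.007))
    print(f"input pipeline share: {share:.0%}")
```

On a real GPU job the training step usually launches work asynchronously, so the step would need to synchronize with the device before the second timestamp for the split to be meaningful.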

In particular, the challenges in data storage and management require planning for data growth, so that you can keep extracting value from your data as you progress. This matters most when you move into more advanced use cases such as deep learning and neural networks, which place higher demands on storage in terms of capacity, performance, and scalability.

Specifically:

Scalability
Machine learning requires handling vast amounts of data, and as the volume of data grows, model accuracy also improves. This means businesses must collect and store more data every day. When storage cannot scale, data-intensive workloads create bottlenecks, limiting performance and causing costly GPU idle time.

Flexibility
Flexible support for multiple protocols (including NFS, SMB, HTTP, FTP, HDFS, and S3) is necessary to meet the needs of different systems, rather than being limited to a single type of environment.

Latency
I/O latency is critical for building and using models, because data is read and reread many times. Reducing I/O latency can shorten model training time by days or months. Faster model development translates directly into greater business advantages.
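
A back-of-the-envelope estimate shows why latency adds up when the same data is reread every epoch. The sketch below is a simplification under stated assumptions: every read costs the same latency, a fixed number of reads (`in_flight`) overlap at any moment, and none of the wait is hidden behind compute. The function name and all numbers are illustrative, not measurements.

```python
def io_wait_hours(reads_per_epoch, epochs, latency_ms, in_flight=1):
    """Hours spent waiting on I/O if `in_flight` reads overlap at a time."""
    if in_flight < 1:
        raise ValueError("in_flight must be at least 1")
    total_reads = reads_per_epoch * epochs
    wait_seconds = total_reads * (latency_ms / 1000) / in_flight
    return wait_seconds / 3600


# Assumed workload: 100 million small reads per epoch, 50 epochs, 32 reads in flight.
slow = io_wait_hours(100_000_000, 50, latency_ms=5.0, in_flight=32)
fast = io_wait_hours(100_000_000, 50, latency_ms=0.5, in_flight=32)
print(f"5 ms latency:   {slow:.0f} hours of I/O wait")
print(f"0.5 ms latency: {fast:.0f} hours of I/O wait")
```

Under these assumed numbers the difference is on the order of days, which is the scale the paragraph above describes.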

Throughput
The throughput of the storage system is crucial for efficient model training. Training processes involve large amounts of data, typically terabytes per hour.
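
To translate "terabytes per hour" into a sustained read rate that storage has to deliver, a small helper is enough. This is a rough sizing sketch: the function name is hypothetical, it uses decimal units (1 TB = 1000 GB), and it assumes the whole dataset is streamed from storage on every epoch with no caching.

```python
def required_throughput_gb_per_s(dataset_tb, epochs, hours):
    """Sustained read rate (GB/s) needed to stream the dataset `epochs` times in `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    total_gb = dataset_tb * 1000 * epochs  # decimal units: 1 TB = 1000 GB
    return total_gb / (hours * 3600)


# 1 TB per hour corresponds to roughly 0.28 GB/s of sustained reads.
print(f"{required_throughput_gb_per_s(1, 1, 1):.2f} GB/s")

# Assumed job: a 10 TB dataset read for 5 epochs within a 10-hour window.
print(f"{required_throughput_gb_per_s(10, 5, 10):.2f} GB/s")
```

The result is an average; real pipelines also need headroom for bursts and for several jobs sharing the same storage.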

Parallel Access
To achieve high throughput, training jobs split work into many parallel tasks. This often means that machine learning algorithms access the same files from multiple processes (potentially on multiple physical servers) at the same time. The storage system must handle these concurrent demands without compromising performance.
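
The access pattern described above can be sketched on a single machine. In this illustration, threads stand in for the separate processes or servers of a real training job, and a local temporary file stands in for a file on shared storage; `read_chunk` and `parallel_read` are hypothetical helper names.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_chunk(path, offset, length):
    """Open the shared file independently and read one byte range."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def parallel_read(path, workers=4):
    """Read the whole file as up to `workers` byte ranges fetched concurrently."""
    size = os.path.getsize(path)
    chunk = -(-size // workers)  # ceiling division
    offsets = range(0, size, chunk) if size else []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the parts join back in file order.
        parts = pool.map(lambda off: read_chunk(path, off, chunk), offsets)
        return b"".join(parts)


if __name__ == "__main__":
    payload = os.urandom(1_000_000)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        assert parallel_read(tmp.name, workers=8) == payload
        print("all workers read the same file concurrently")
    finally:
        os.remove(tmp.name)
```

Each worker opens its own file handle, which mirrors how independent training processes would hit the same file; whether throughput holds up as workers are added depends on the storage system behind the path.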

With its outstanding capabilities in low latency, high throughput, and massively parallel I/O, Dell PowerScale is an ideal storage complement to GPU-accelerated computing. PowerScale effectively reduces the time required for analytics models that train and test on multi-terabyte datasets. With PowerScale all-flash storage, bandwidth increases 18-fold, eliminating I/O bottlenecks, and it can be added to existing Isilon clusters to accelerate and unlock the value of large amounts of unstructured data.

Moreover, PowerScale's multi-protocol access capabilities provide unlimited flexibility for running workloads, allowing data to be stored using one protocol and accessed using another. Specifically, the powerful features, flexibility, scalability, and enterprise-grade functionality of the PowerScale platform help address the following challenges:

- Accelerate innovation by up to 2.7 times, shortening the model training cycle.

- Eliminate I/O bottlenecks and provide faster model training and validation, improved model accuracy, enhanced data science productivity, and a higher return on computing investments by leveraging enterprise-grade features, high performance, concurrency, and scalability. Enhance model accuracy with deeper, higher-resolution datasets by leveraging up to 119 PB of effective storage capacity in a single cluster.

- Achieve deployment at scale by starting small and scaling compute and storage independently, with robust data protection and security options.

- Improve data science productivity with in-place analytics and pre-validated solutions for faster, lower-risk deployments.

- Leverage proven designs based on best-of-breed technologies, including NVIDIA GPU acceleration and reference architectures with NVIDIA DGX systems. PowerScale's high performance and concurrency meet storage performance requirements at every stage of machine learning, from data acquisition and preparation to model training and inference. Together with the OneFS operating system, all nodes can operate seamlessly within the same OneFS-driven cluster, with enterprise-level features such as performance management, data management, security, and data protection, enabling businesses to complete model training and validation faster.


Post time: Jul-03-2023