Overview
I am Guobao LI, an undergraduate from SCUT and a graduate from Polytech Nantes. It was my pleasure to take part in the project SYSTEMML-2083 with the Apache community and my mentor Matthias Boehm during GSoC. The objective of SYSTEMML-2083 is to design and implement the language and runtime support for a parameter server in Apache SystemML. In this project, we implemented and tested the local and Spark data-parallel parameter servers, including the update strategies (BSP, ASP), the update frequencies (Batch, Epoch), and the data partitioning schemes (DC, DR, DRR, OR). I am glad that our contribution will be integrated in the next release of Apache SystemML.
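To give a feel for the data partitioning schemes named above, here is a minimal Python sketch. It is not the SystemML implementation: the function names are hypothetical, and the reading of the abbreviations (DC as disjoint contiguous, DR as disjoint random, DRR as disjoint round robin, OR as overlap reshuffle) is my understanding of the paramserv documentation and should be checked against it. Rows are modeled as a plain list and `k` is the number of workers.

```python
import random

def disjoint_contiguous(rows, k):
    """DC (assumed meaning): split rows into k contiguous, near-equal blocks."""
    base, extra = divmod(len(rows), k)
    parts, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` workers get one more row
        parts.append(rows[start:start + size])
        start += size
    return parts

def disjoint_round_robin(rows, k):
    """DRR (assumed meaning): row i goes to worker i mod k."""
    return [rows[i::k] for i in range(k)]

def disjoint_random(rows, k, seed=0):
    """DR (assumed meaning): shuffle once, then split into disjoint blocks."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    return disjoint_contiguous(shuffled, k)

def overlap_reshuffle(rows, k, seed=0):
    """OR (assumed meaning): every worker gets its own reshuffled copy of all rows."""
    rng = random.Random(seed)
    parts = []
    for _ in range(k):
        copy = rows[:]
        rng.shuffle(copy)
        parts.append(copy)
    return parts

rows = list(range(10))
print(disjoint_contiguous(rows, 3))   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(disjoint_round_robin(rows, 3))  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

In this sketch the three disjoint schemes give each row to exactly one worker, while the overlapping scheme lets every worker see all rows in a different order.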
Subject in JIRA:
Here is the link pointing to all the issues in JIRA.
Pull Requests:
Here is a list of the pull requests related to the project. All the code has already been merged into Apache SystemML master.
Category | Pull Request | Commit |
---|---|---|
Language extension | PR-817 PR-764 | SYSTEMML-2299 SYSTEMML-2084,2317-20 |
Local runtime | PR-790 PR-785 PR-783 PR-782 PR-781 PR-780 PR-777 PR-771 | SYSTEMML-2416 SYSTEMML-2389 SYSTEMML-2381 SYSTEMML-2380 SYSTEMML-2364,66-88 SYSTEMML-2359 SYSTEMML-2344,48,49,52 SYSTEMML-2085 |
Spark runtime | PR-814 PR-808 PR-805 PR-799 PR-793 | MINOR SYSTEMML-2420,2457 SYSTEMML-2420,2422 SYSTEMML-2419 SYSTEMML-2418 |
Bug fix | PR-809 PR-802 PR-791 PR-789 PR-787 PR-766 | SYSTEMML-2469 SYSTEMML-2446 SYSTEMML-2403 SYSTEMML-2413 SYSTEMML-2392/8,2401/2/6 MINOR |
Documentation | PR-816 | SYSTEMML-2090 |
Experiments:
While implementing the parameter server, we also ran experiments to analyze the performance and to detect potential bugs proactively. All the experimental scripts and results have been pushed to this GitHub repository.
Documentation:
Here is the link pointing to the documentation of the paramserv function. Note that this is a snapshot of the Apache SystemML language documentation as of 08/05/2018.
Presentation:
I had the valuable opportunity to give a presentation about my work to the Apache SystemML community. The provided slides give a complete overview of the project, including what a parameter server is, how to use the paramserv function, a demonstration, and the experimental results. Please feel free to take a look at the slides.