We are excited to announce the integration of GitHub with the IBM Data Science Experience (DSX) to improve collaboration among data scientists. Now you can combine code, data and visualizations in real time with your colleagues.
Last year we announced a strategic partnership with GitHub. GitHub boasts roughly 12 million users, including some 60,000 organizations and many data scientists. "Millions of developers use GitHub to build personal projects, support their businesses, and work together on open source technologies," said GitHub's Todd Berman, VP of Engineering. "GitHub is a powerful addition to IBM's Bluemix that builds on our strategic partnership to dramatically advance the development of next generation cloud applications for enterprise customers."
This is how GitHub works with the Data Science Experience:
1- Connect your GitHub account with DSX
In user settings, there is a new tab called 'Integrations'. Add your GitHub access token here.
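Before pasting a token into DSX, it can help to confirm that GitHub accepts it. The following is a minimal sketch, not part of DSX: it builds a request to GitHub's public REST endpoint for the authenticated user. The function name build_token_check is hypothetical and chosen only for illustration.

```python
import urllib.request


def build_token_check(token):
    """Build (but do not send) a request that asks GitHub who owns the token.

    Hypothetical helper for illustration; DSX does not expose this function.
    """
    return urllib.request.Request(
        "https://api.github.com/user",
        headers={
            # Personal access tokens are sent in the Authorization header.
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )


if __name__ == "__main__":
    request = build_token_check("YOUR_TOKEN_HERE")  # placeholder, not a real token
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen sends it. A valid token should return your account details as JSON, while a rejected token raises urllib.error.HTTPError.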
2- Connect a GitHub repo to a DSX Project
In the DSX project settings, you will see a new option to connect the project with a GitHub repo. Simply copy and paste the GitHub repo URL to link it. This works for both public and private repos.
3- Publish your notebook to GitHub
From the Projects page, you can publish your notebooks to your GitHub repo. The file will be pushed to the master branch, and any existing version will be replaced. We plan to add options for creating branches in the future.
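To make "pushed to the master branch, and any existing version will be replaced" more concrete, here is a rough sketch of how such a publish could be expressed with GitHub's public REST endpoint for creating or updating file contents. This is an assumption for illustration only; the post does not say how DSX performs the push. The function name build_publish_request and the commit message format are hypothetical.

```python
import base64
import json


def build_publish_request(owner, repo, path, notebook, existing_sha=None, branch="master"):
    """Return (url, payload) for a PUT to GitHub's file contents endpoint.

    Hypothetical helper: it only prepares the request and sends nothing.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    # The endpoint expects the file body as Base64-encoded text.
    raw = json.dumps(notebook, indent=1).encode("utf-8")
    payload = {
        "message": f"Publish {path}",  # commit message (illustrative wording)
        "content": base64.b64encode(raw).decode("ascii"),
        "branch": branch,
    }
    # Replacing an existing file requires the blob SHA of the current version.
    if existing_sha is not None:
        payload["sha"] = existing_sha
    return url, payload


if __name__ == "__main__":
    empty_notebook = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 0}
    url, payload = build_publish_request("some-user", "some-repo", "analysis.ipynb", empty_notebook)
    print(url)
    print(sorted(payload))
```

The optional sha field is what distinguishes creating a new file from replacing an existing one, which mirrors the overwrite behavior described in this step.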
4- Import notebooks from GitHub
When you create a new Jupyter notebook, you can choose to start from scratch with a blank notebook, import one from an *.ipynb file, or import one from a URL. Use the import from URL option to bring in one of the thousands of cool notebooks available on GitHub!
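If you want to check a notebook URL before importing it, the sketch below may help. It assumes the importer needs a link to the raw *.ipynb file rather than GitHub's HTML page for it (the post does not specify this), so it rewrites a github.com "blob" URL into its raw.githubusercontent.com form and checks that the downloaded text looks like a version 4 notebook. The names to_raw_url and parse_notebook are hypothetical.

```python
import json
from urllib.parse import urlparse


def to_raw_url(url):
    """Rewrite a github.com 'blob' page URL to the raw file URL.

    Assumes the branch name contains no slashes; other URLs pass through unchanged.
    """
    parts = urlparse(url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc == "github.com" and len(segments) >= 5 and segments[2] == "blob":
        owner, repo, _, branch = segments[:4]
        file_path = "/".join(segments[4:])
        return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{file_path}"
    return url


def parse_notebook(text):
    """Parse notebook JSON and check for the top-level keys of the v4 format."""
    notebook = json.loads(text)
    if "nbformat" not in notebook or "cells" not in notebook:
        raise ValueError("not a version 4 Jupyter notebook")
    return notebook


if __name__ == "__main__":
    print(to_raw_url("https://github.com/some-user/some-repo/blob/master/demo.ipynb"))
```

To fetch the file, pass the result of to_raw_url to urllib.request.urlopen and hand the decoded response body to parse_notebook.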
See Publish notebooks on GitHub for more details.
Collaborative code review with GitHub integration
By pushing your Jupyter notebooks to GitHub, you gain access to collaborative code review, which is essential. After you create a branch and make one or more commits, a pull request starts the conversation around the proposed changes. Additional commits are commonly added in response to feedback before the branch is merged.
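For readers new to this flow, here is a minimal sketch of the Git commands behind "create a branch, commit, push", wrapped in Python so they can be scripted. It assumes Git is installed and that you run it inside a local clone whose remote is named origin. The function names and the example branch name are hypothetical, and the pull request itself is still opened on GitHub.

```python
import subprocess


def review_workflow_commands(branch, notebook_path, message):
    """Return the Git commands that put a notebook change on its own branch."""
    return [
        ["git", "checkout", "-b", branch],        # create and switch to a new branch
        ["git", "add", notebook_path],            # stage the edited notebook
        ["git", "commit", "-m", message],         # record the change
        ["git", "push", "-u", "origin", branch],  # publish the branch to GitHub
    ]


def run_commands(commands, cwd):
    """Run each command in the given clone, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    for command in review_workflow_commands("fix-plots", "analysis.ipynb", "Fix axis labels"):
        print(" ".join(command))
```

Once the branch is pushed, open a pull request for it on GitHub. Later commits pushed to the same branch are added to that pull request.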
Today, data scientists make copies of their files and scripts and try to keep track of the versions of their work manually. This can work for small projects, but when several data scientists work together and the project grows, it quickly turns into a problem. GitHub integration solves this problem for you.
Do you find Git too complicated?
We still have a simplified Version Control available for you. This is based on Jupyter checkpoints and it is limited to 10 versions of each notebook.
You can also learn the basics of Git in 15 minutes: https://try.github.io/levels/1/challenges/1