From the course: Learning AI with GitHub Copilot

Getting started

Hi everyone. Welcome to the first episode of Learning AI with GitHub Copilot. My name is Carlotta Castelluccio and I'm a Cloud Advocate at Microsoft, focused on AI and machine learning. I will not be alone on this learning journey. I'm happy to have Gustavo here with me. Hi, Gustavo.
Hi, Carlotta. Thank you. My name is Gustavo. I am also a Cloud Advocate here at Microsoft, focused on AI and machine learning technologies and their practical applications. I will be collaborating with Carlotta on this journey while also taking the opportunity to learn many things myself.
Awesome. I think we are both looking forward to starting, so let's kick off. This series is designed for machine learning beginners who would like to get started with Python. In this first video, we'll guide you through setting up your data science environment with Visual Studio Code and GitHub. As prerequisites, we encourage you to download Visual Studio Code and create a GitHub account if you don't have one.
Yeah. The most popular tool among data scientists is Jupyter Notebooks, and since they let you easily combine markdown text and executable source code on one canvas, that's what we're going to use for our code here. We'll be using Visual Studio Code since it supports working with Jupyter Notebooks and also provides built-in GitHub commands and extensions, so we'll be able to pull data from repositories as well. A fun fact about the Jupyter name is that it combines the names of the three core languages it supports: Julia, Python, and R. Carlotta, do you mind showing us how to use Jupyter Notebooks in VS Code?
Yes, of course. Let me show you. To create your first Python Jupyter Notebook in Visual Studio Code, you need to have Python installed on your machine and the Python extension for Visual Studio Code installed. Once you have those, you can simply press "Ctrl + Shift + P" to open the Command Palette, and then click on "New Jupyter Notebook" here.
Once you are in your new Jupyter Notebook, the main elements you will see are the kernel and the cells. The kernel, which here is Python 3.10.10, is a programming-language-specific process that executes the source code written in the cells, while the cells are these boxes here, which make up the notebook. For this example, we'll use the default kernel we have here. Let me start writing a simple notebook with a markdown cell, just to give this notebook a title. For example, the title can be "This is my first Python notebook", so let me copy-paste it here. Great. So this is our title for the notebook. Then let me add a Python code cell in which I'm going to print the same string, using the print function here. If I execute the cell, you see that the notebook connects to the kernel and then prints out my first string in my first notebook. That's it. We have our first Jupyter Notebook.
Cool. And now that we've created this notebook, do we start doing data science right away or do we have to do something else?
Well, not right away. Python provides lots of packages, like pandas, matplotlib, or numpy, to help us handle data visualization or modeling tasks. However, we need to install these packages in our environment to be able to use them. That's where GitHub dev tools come into play, and specifically GitHub Codespaces.
Awesome. I know GitHub Codespaces as a development environment hosted in the cloud that you can customize with configuration files. So working with a Codespace is similar to working with a dev container in a virtual machine, and all the required packages are installed in the remote environment when the Codespace starts. I've used this on the web in the past, but is this something we can also use in VS Code?
Yeah, absolutely.
To do that, you need to install the GitHub Codespaces extension in Visual Studio Code. If we go to Extensions here and search for GitHub Codespaces, we find the right extension to use. I already have it installed, but to install it you just need to click this button here. Then you can create a new Codespace, again using the graphical user interface, starting from an existing GitHub repo. The GitHub repo is the one in which you store the code that you want to execute in the Codespace. So let me show you. If you go to the Remote Explorer tab, you can create a new Codespace. Here you can see all the repositories in my GitHub account. Of course, if you haven't logged in to GitHub yet from Visual Studio Code, this will open a window in which you log in to GitHub first, and then you will also see the list of repositories in your account on your end. So I select the GitHub repository, then I select the branch from which I want to create a Codespace. Then I can also choose the type of virtual machine I want to use for my dev container, and I can choose among all of these machine types. Now, since this will take a few minutes, I will just connect to an existing Codespace I already created, because I would like to show you how you can configure your dev environment with GitHub Codespaces by just creating some configuration files. I will show you in a moment which configuration files these are. So I'm in my GitHub Codespace now, and the configuration files are included in this folder here, which is called .devcontainer. Let's start with the devcontainer.json file, which is one of the two main configuration files. This file includes, for example, the Python version and then the Visual Studio Code settings.
For example, you have here the virtual environment we will be using as the interpreter, but also the Visual Studio Code extensions. For example, we have Python and Pylance, which we will be using for running Python Jupyter Notebooks, and also the GitHub Copilot extension that we'll be using later. Then we have the Dockerfile, which is also referenced in the devcontainer.json file. Here you can find the parent image on which the Docker image for my Codespace is based. We'll also find a reference to the requirements.txt file that you saw for a moment before, which is a list of all the requirements I need in my data science dev environment, so that once the Codespace is created and launched, you already have all these packages, all these libraries for data science tasks, installed. This is great, right, Gustavo?
Yeah, it's actually pretty awesome because it covers everything that we're going to be using in this whole learning journey. So it's cool to have everything in one place. I feel like the only thing we're missing now is introducing what's going to accompany us throughout this journey, our travel buddy, GitHub Copilot. Copilot is an AI-powered pair programmer that offers autocomplete-style suggestions as you code in different languages, like Python. You receive suggestions from GitHub Copilot either by starting to write the code or by writing a natural language comment describing what you want it to do. Even if you want to ask it questions or want it to help complete your code, it'll do it in an instant.
Yeah, right. And let me also add how we can use it in Visual Studio Code to get help without leaving the dev environment we've just created. In the Codespace I showed you before, GitHub Copilot is pre-installed because we have defined it here in our list of Visual Studio Code extensions.
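The three configuration files described here (devcontainer.json, the Dockerfile, and requirements.txt) might look roughly like what the following Python sketch prints. The actual files from the video are not reproduced in this transcript, so every value below is an assumption made for illustration: the container name, the parent image tag, the /opt/venv virtual environment path, the location of requirements.txt inside .devcontainer, and the exact extension list. The helper name render_devcontainer_files is hypothetical too.

```python
import json

# All values are illustrative assumptions, not the files shown in the video.
devcontainer = {
    "name": "learning-ai-with-copilot",  # hypothetical container name
    # Points at the Dockerfile that sits next to devcontainer.json.
    "build": {"dockerfile": "Dockerfile"},
    "customizations": {
        "vscode": {
            # Use the virtual environment as the Python interpreter (assumed path).
            "settings": {"python.defaultInterpreterPath": "/opt/venv/bin/python"},
            # Extensions installed automatically when the Codespace starts.
            "extensions": [
                "ms-python.python",
                "ms-python.vscode-pylance",
                "ms-toolsai.jupyter",
                "GitHub.copilot",
            ],
        }
    },
}

# Dockerfile: parent image, then a virtual environment with the requirements.
dockerfile_lines = [
    "FROM mcr.microsoft.com/devcontainers/python:3.10",  # assumed parent image
    "COPY requirements.txt /tmp/requirements.txt",
    "RUN python -m venv /opt/venv",
    "RUN /opt/venv/bin/pip install -r /tmp/requirements.txt",
]

# Packages named earlier in the episode, plus ipykernel for notebooks (assumed).
requirement_lines = ["pandas", "matplotlib", "numpy", "ipykernel"]


def render_devcontainer_files():
    """Return a mapping of relative file path to file content."""
    return {
        ".devcontainer/devcontainer.json": json.dumps(devcontainer, indent=2) + "\n",
        ".devcontainer/Dockerfile": "\n".join(dockerfile_lines) + "\n",
        ".devcontainer/requirements.txt": "\n".join(requirement_lines) + "\n",
    }


if __name__ == "__main__":
    # Print each file so it can be compared with your own .devcontainer folder.
    for path, content in render_devcontainer_files().items():
        print(f"--- {path}")
        print(content)
```

Running the script only prints the three files; it does not write anything to disk. Listing GitHub.copilot under the extensions is what makes Copilot available as soon as the Codespace starts, which matches the point made just above.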
But generally speaking, to get the GitHub Copilot extension for Visual Studio Code, you just need to type "github copilot" here in the Extensions tab, and you will find the right extension to install. So it's very easy to do that.
Awesome. Well, let me conclude by saying that you can use or try every single one of the tools we've mentioned so far for free. Visual Studio Code is available for free. You can benefit from 120 core hours per month of GitHub Codespaces in your GitHub account for free, and you can sign up for a 60-day trial of GitHub Copilot, so you're able to try everything we're going to be doing in this video series.
Yeah, exactly. Thanks, Gustavo. I think we are all set up now. In the next episode we'll be dealing with machine learning fundamentals and building machine learning demos with Python and, of course, with the help of GitHub Copilot. So thanks, everyone, and see you in the next episode.
Thanks.
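As a recap of the hands-on part of this episode, the first code cell and a quick check for the data science packages could be sketched as follows. The title string comes from the walkthrough; the missing_packages helper and the package list are illustrative assumptions, not something shown in the video.

```python
import importlib.util

# First code cell from the walkthrough: print the notebook title.
title = "This is my first Python notebook"
print(title)

# Packages named in the episode; adjust this list for your own project.
PACKAGES = ["pandas", "matplotlib", "numpy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(PACKAGES)
if missing:
    # In a plain local environment some of these may not be installed yet.
    print("Not installed:", ", ".join(missing))
else:
    # Expected in a Codespace whose requirements.txt lists these packages.
    print("All packages are available.")
```

You can paste this into a single notebook cell. The check only looks the packages up without importing them, so it runs quickly even in a fresh environment.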
