AI, Machine Learning
POSTED ON AUGUST 21, 2018
This is a transcript of a recent web seminar given by KTBYTE staff and a student. It contains the following parts. Feel free to jump to the subtopic that interests you.
Part 1. INTRODUCTION – WHAT IS AI AND MACHINE LEARNING
Pratik: Hi, my name is Pratik. I’m a rising junior at Acton Boxborough Regional High School, and I’ve been taking KTBYTE classes for the past four years, from introductory Java lessons all the way to the CS85 machine learning class. I’m doing an internship at Dana Farber Cancer Institute this summer, and it mainly has to do with machine learning and deep learning.
Benjamin: Today we’re going to talk about AI. AI is a very popular field right now for academic researchers, for people in industry, and sometimes for students. And it’s a complicated field because no one’s exactly sure what AI is. A lot of people say AI is a computer demonstrating some kind of intelligence. We are going to talk about what AI is, the history of AI, and what AI can do. Then we will let Pratik talk about his experience. Lastly, we will answer your questions and briefly describe KTBYTE core classes.
Two buzzwordy terms these days are machine learning (using a computer to learn from information) and deep learning (specifically, neural networks that consist of layers of neurons in the computer, which have similarities to our brain and allow the computer to make predictions). A lot of AI these days is being able to take an input like a picture of a cat or dog, and tell you whether it’s a cat or a dog. Pratik, what kinds of predictions are you trying to make, for example, in some of your projects?
Part 2. AI HISTORY
Benjamin: For the first part of our talk, we’re going to talk a little bit about the history of artificial intelligence. A lot of people see AI applications towards cancer research and say, “wow, that’s amazing!” We [can] look at, for example, the self-driving cars that are coming out, or Alexa being able to understand your voice when you’re talking to a device. I think there’s a lot of promise in this field, but it’s also interesting to see where we came from. So we’re going to start with a little bit of history before we dive too much into the applications.
AI is actually a pretty old field, dating back to the fifties, sixties, and seventies. Back then there were programs that could, for example, play checkers, and might look to see whether there’s a piece that you can jump over in order to capture it. A lot of these used if statements and logic, the kind of programming that a lot of our first or second year programmers, sometimes our 9 and 10 year olds, learn. But back in the fifties and sixties, this was very revolutionary because prior to that, computers couldn’t really make decisions. If you had a workstation in an office or at a research center, you could suddenly allow computers to behave in some kind of logical way.
For example, one popular program from 1965 called Eliza pretended to be a psychotherapist, someone you might talk to if you have depression or [other similar mental illness]. The way Eliza worked is that if you typed a message like, ‘Oh, I’m very sad,’ Eliza would notice the word ‘sad’ and would speak back and say, ‘Oh, tell me a little bit more about why you’re sad,’ or ‘When did you start feeling sad?’ There were also programs in 1979 that allowed you to see if you had a disease. [For those cases], if you typed something like, ‘Oh, I have a fever,’ it would find all articles for diseases that had the keyword ‘fever’ in them. And I think for anybody who grew up in the last 20, 30, or 40 years, this seems very easy, very obvious, and not like artificial intelligence.
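As a rough illustration of the keyword matching described here, a minimal Python sketch might look like the following. The rule table, the replies, and the function name `respond` are made up for this example; the real Eliza script was larger and is not reproduced here.

```python
# Hypothetical keyword rules for illustration only (not the original Eliza script).
RULES = {
    "sad": "Tell me a little bit more about why you're sad.",
    "angry": "When did you start feeling angry?",
}
DEFAULT_REPLY = "Please go on."


def respond(message: str) -> str:
    """Return the canned reply for the first known keyword found in the message."""
    # Lowercase the message and strip simple punctuation from each word.
    words = [word.strip(".,!?'\"") for word in message.lower().split()]
    for word in words:
        if word in RULES:
            return RULES[word]
    # No keyword matched, so fall back to a generic prompt.
    return DEFAULT_REPLY


if __name__ == "__main__":
    print(respond("Oh, I'm very sad"))
    print(respond("Hello there"))
```

The program has no understanding of the sentence; it only checks whether a known word appears, which is why this style of AI feels simple today.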
Throughout the decades, and even now, people have thought computers were going to become almost sentient, almost replacing humans. But every time something new comes out, we get accustomed to that innovation over time and we quickly move on to the next big thing. Looking back at early logic-based AI programs, all they could do was play very simple games of checkers. In 1994, the first programs that could defeat humans in checkers came out.
People didn’t believe that computers would be able to advance to playing chess, because of the huge number of move combinations chess involves. Just a couple of years later, in 1996/1997, a computer program called Deep Blue came out. It was in the news when Garry Kasparov, a very famous chess champion from Russia, was defeated by this computer. Computers were finally able to handle these data structures because of how powerful mainframe computers were becoming. It was pretty amazing because suddenly we thought, ‘Wow, computers will beat humans at any kind of game,’ and maybe at some point a computer would be able to diagnose a disease better than a human could. That did not actually turn out to be true; obviously it didn’t happen in the nineties. But you can see that a lot of these concepts came out early, in the 1950s and the mid 1900s, and were popularized later as what became ‘AI’ in the late nineties. Even back in the seventies and eighties, we had concepts such as neural networks and back-propagation. So it’s interesting to see this progress, because once we hit the nineties, and as you’ll see later the 2000s, people started thinking, ‘Wow, it’s going to be support vector machines, these statistical approaches that are sometimes able to do character recognition, or even neural networks, deep neural networks.’ Those became popular in 2010, 2011, 2012.
So the most recent trend is these deep neural networks. You can see the chart below. (insert chart here).
It’s part of an image recognition contest called the ILSVRC contest. You have this red bar, which is the error. You can see the error in 2010 was around 28%. By 2014, it had dropped dramatically to 7%, which means the accuracy rate was 93%. What happened is that from 2010 to 2014, the number of people in this contest, mostly universities, who were using GPUs increased a lot. A lot of people started using these gaming processors, which existed because people played a lot of video games in the 90s and 2000s. So we have this new consumer hardware. With it, technology such as vectorization and neural networks, which came about in the 1970s, became really useful in the 2000s and around 2010. And so right now when people say artificial intelligence, usually they’re talking about deep learning.
This chart is very important, because it means that when you’re learning machine learning and artificial intelligence, the era of history you focus on determines what kind of technologies and techniques you’ll actually learn. It’s very likely that after another 10, 15, or 20 years, we’ll see AI go through another winter and we’ll say, ‘Oh, AI is dead; deep neural networks, maybe they aren’t that good,’ and we’ll move on to the next era, in which some new technology is implemented. Another theme that you should take from this is that with every single change in artificial intelligence, you saw corresponding changes in hardware. In the beginning it was mainframe computers, then personal and distributed computers, and now graphics processing units. Nowadays, with these new GPU-based neural networks, computers can recognize images, recognize pedestrians in the streets, do facial recognition, do language and sentiment analysis, do radiology, and beat humans at games such as Go, as AlphaGo did.
Recently there are some research papers just coming out where computers are trying to play video games such as League of Legends.
Part 3. AI APPLICATIONS
Let’s talk about cutting-edge applications like text recognition, object classification, face recognition, translation, and self-driving cars. What we find with many of these technologies is that it’s easy to get to maybe 90% or 95% accuracy, but oftentimes the last 5% is very, very difficult. For example, in the case of text recognition, a lot of it is the context: the culture, the history, or the current events that the text is based on. And when it comes to self-driving cars, a lot of it is based on predicting human behavior: the other drivers, or people on the road and on local roads. So a lot of these things are still not there yet, but you can see that these applications are coming out. Other popular topics that have been attacked by neural networks recently involve biological, chemical, or physical simulations, or biological, chemical, or physical data.
So Pratik mentioned mammograms and cancer. Even when you’re not detecting cancer, you might just want to tell whether a cell is dividing, that is, undergoing mitosis, or not. In my first internship, back when I was Pratik’s age, I was counting cells using a clicker machine. It was really painful. Eventually we tried to use a program that would find color barriers and use flood fill to detect barriers. This was very hard back then, and nowadays that kind of algorithm is very easy.
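To give a feel for the flood fill idea, here is a minimal Python sketch that counts connected blobs in a tiny black-and-white grid, where 1 stands for a cell pixel and 0 for background. This is a toy version under those assumptions, not the program used in that internship; the function name `count_regions` and the sample grid are hypothetical.

```python
from collections import deque


def count_regions(grid):
    """Count groups of connected 1s (up, down, left, right) using flood fill."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Found an unvisited cell pixel: this starts a new region.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            # Flood outward, marking every pixel that belongs to this region.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Made-up sample image containing three separate blobs.
IMAGE = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]

if __name__ == "__main__":
    print(count_regions(IMAGE))
```

Real microscope images are much messier than this grid, with touching cells and uneven color, which is part of why the task was hard.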
Pratik: Also, in the sciences, I was reading a paper saying that deep learning can be used for optimization with the GAN (Generative Adversarial Network) work that’s coming out, the generative networks. You can actually make the most optimal thing, and you can use that in engineering. You can also use that to make the most optimal drug. So deep learning is coming up in the optimization field as well.
Benjamin: Absolutely. You mentioned drugs; this next thing is toxicity prediction for chemicals. There’s a good number of pharmaceutical applications these days in which, rather than starting with clinical trials, you start with a data-based technique: you take some kind of compound and run it against other similar compounds to try to estimate what its behavior might be like, both in vivo and in vitro. Previously, this kind of wet lab work might be very expensive. Now, with a data-driven technique, it might not be perfectly accurate, but oftentimes it narrows the search space so you don’t have to do as much in the lab as you might’ve had to do before.
And similarly with physical applications: before, you had very complicated programs to simulate stress and strain of bridges, or to simulate, in this case, fluids. Recently we’ve been able to create computer programs that ask, ‘Okay, by looking at all of these previous fluid simulations, can you estimate what future fluid simulations will look like?’ And they can try to derive some of the rules just by watching how things behave.
And so definitely, deep learning is impacting science right now. But we see it not just in the sciences, but also in a lot of regular consumer applications. Some of these are really cool, some of these are pretty scary. You see it in image coloring, if you want to take a picture and make it a certain color. Or you can, for example, take a grayscale picture and have the computer guess what the colors should be. These are examples of grayscale pictures that the computer colored, and it does a remarkably good job. (Insert a picture here)
So the right side is the true color, which the model never saw. The output is the middle image. You can see from this image that it figures out that grass is green, figures out that certain animals are different colors from the grass, and all these kinds of things. Also, if you just want to change a picture from one color to another, that’s being made a lot easier. You can also, for example, take a picture, in this case of the Eiffel Tower, and have a program try to detect where you are in the world. This photo geolocation is very useful for a lot of imaging applications, like Instagram or Facebook. You take a picture and they’ll automatically try to estimate the location in the world from the foliage, from the architecture, and from the people and the clothing and fashion they’re wearing.
And previously this would have been extremely difficult for a regular algorithm. You can see it applied towards consumer applications and artistic applications. A lot of things that we thought were really impossible a short time ago are now possible. One is like this picture here, where you take a picture, cut off a piece of it, and tell the algorithm to fill in the blank. There’s a really impressive video by NVIDIA that came out in April, within the last six months. It’s called ‘AI Reconstructs Photos with Realistic Results,’ and you can see people drawing these white areas and the algorithm, on the right side, filling in those areas. So for example, we have this bridge in the picture, and the algorithm gets rid of the bridge and fills it in with a realistic rocky landscape to fit in with the landscape here.
And you can even get rid of this enclosure. You can do this in rooms. You can do this with faces. You can see here that the face on the right looks kind of wrong, but that’s because the eyes were generated by the algorithm. A lot of these applications can be useful for the arts. Previously, if you used Photoshop, you might be able to resize things, for example to make objects look smaller or bigger. Now suddenly you can add realistic elements to an image without having to draw them painstakingly or copy them from one image to another. So these are really powerful artistic and graphical applications of machine learning. Here on the right side, we have some image captioning algorithms, which allow you to, for example, take an image, determine portions of it, and label them with text. You also have these super resolution algorithms. On the left side, you have the original picture. In the middle, you have the picture that was fed to the algorithm. And on the right side, you have the picture the algorithm produced. So it’s able to take a low resolution image and create a high resolution image. I remember when I was a kid watching CSI, NCIS, or some kind of crime show, and they would have a picture or video of some zoomed-in, grayed-out image. It’d be like, ‘Enhance! Oh, now we know what the driver’s license plate is, now we can chase them.’
And at the time I was like, ‘That’s impossible! I’m a programmer; I know that’s impossible.’ And now you know what? They can do it. But the caveat is that it might not be the right license plate number, because the computer is just trying to guess and fill in the blanks. So what computers can do now is fill in the blanks and make things that can trick a human. But it may not be correct, right? If you blocked out her nose and she used to have a dimple or a freckle, it might not be there in the reconstructed image. So a lot of these algorithms are very powerful because they can produce results that human labelers cannot distinguish from the real thing.
A couple more papers before we move on; one is image synthesis. So we have here, this image, you can see the colored one, which is called a semantic map.
A human draws this and says, I want the road here, I want a car here, I want trees here, I want pedestrians here. And this other image, which looks like a photograph, is actually generated by the algorithm. So the algorithm is able to generate visual elements from these semantic maps. Currently, if you want to make a video game, you have to import lots of complicated textures. You have to render all of these images and try to make them look realistic. It looks like these new algorithms may enable you to make a game that has all sorts of cool elements without having to even program any of the models used in the game. These algorithms also allow you to, for example, change faces. There are some very scary videos of people making Barack Obama or Donald Trump say something they didn’t actually say, because the algorithm can simulate their mouth movements and simulate their voices. Scary stuff. But there are also algorithms being released to try to detect these. You can have algorithms that determine what’s making a sound, and they can figure out that it’s a person making a sound and it’s coming out of their mouth, or that it’s a violin, or a cat. They can figure out from video which things are making which sounds. There are also algorithms that are able to take completely changed scenes and fill in not just a picture of a face, but, for example, in this bottom right picture, a picture of a bay, and construct boats or fill in the rest of the bay. So there are a lot of these cool image-based technologies that weren’t even imaginable a short time ago.
Part 4. SOFTWARE AND HARDWARE REQUIREMENTS FOR CODING AI
One question from the audience: ‘Can you put AI in Python?’
Pratik: Yes, you can code AI in any language you want. As for whether it’s easier to do it in Java or Python, I would choose Python because Python gives you a better framework to do AI. Some of the packages that I use, like Keras, are specifically designed to be used in Python and are very easy to use because you don’t have to pre-code all the functions yourself. All you have to do is call the functions that are pre-coded for you and organize them in whatever way you want. So it’s way easier to do it in Python, and Python is the preferred language in AI. USACO prefers Java because it’s more algorithm based, not machine learning based; it’s more ‘Can you create your own algorithm to solve a real world problem?’ So that’s why they use Java there.
Benjamin: There’s a bit of history here, in the sense that a lot of the academic researchers who are working on machine learning right now are building their libraries in Python. Because it’s a relatively new field, there isn’t great library support in every single language. This is very different from, for example, making a program that converts one type of image to another type of image, like a BMP to a PNG or something like that. That kind of library you could easily find in pretty much every single modern language. But yeah, Pratik is right. Right now, Keras, TensorFlow, Theano: those are just way easier to use in Python. Then for Java specifically, there is Deeplearning4j, which is the most popular Java-based framework.
Pratik: I didn’t know Deeplearning4j existed. I usually use Python, but yeah, that doesn’t mean that if you want to use a specific language you can’t do machine learning. It just means it might be a lot harder to get started, or that you’ll have a lot more trouble finding the right examples that you can use. Or it might be more difficult getting it to run on your computers.
Benjamin: Can you speak a little bit about hardware for a second Pratik? Because you’ve been doing some research in different environments, like ‘How does hardware affect how you do machine learning?’
Pratik: So hardware mainly affects the amount of time that your algorithm might take to run.
Right now the hardware that I’m using is connected to a virtual server. What I’m using is a pretty good graphics card, the GTX 1080 Ti, and we also have a TITAN X here. Those are two really good GPUs. Compared to that, if you were using a GTX 960 or 970 that you have in your house, the only thing that’s affected is the amount of time that your algorithm will take to run. Let’s say with the GTX 1080, you could probably run a really big deep neural network in about 2 to 3 hours, while a not so good graphics card would take much longer to run.
Benjamin: Yes, one thing I’d like to add is that if you don’t have a GPU, it could take a really long time to run. If you’re running on your thin and light Ultrabook laptop, your MacBook Air, or something like that, it’s going to take like two weeks.
Part 5. WHY STUDENTS SHOULD LEARN CS AND HOW TO START?
A common question we get now is, ‘Wow, AI is doing all these amazing things. It’s replacing truck drivers, it’s replacing doctors, it’s replacing graphic designers, it’s replacing paralegals, huh? Is it true that AI is going to destroy all these jobs?’
It’s a complicated question, because I think most people who are researchers don’t think too much about the economic implications. But I think most people who are studying the economic implications would say that it’s not necessarily going to destroy as many jobs as the newspapers or media might say. Most likely, the biggest effect is that it’s going to change a lot of jobs. The analogy is that if you were a graphic designer in the 1980s and ’90s and you didn’t use Photoshop, that was okay. But if you’re a graphic designer now and you don’t use Photoshop or Illustrator or Figma, your life is going to be a lot more difficult. And in the future, there may still be truck drivers, but maybe more of their responsibility is going to be the regulatory, the licensing, the bureaucratic, and also the managing of several trucks rather than just one single truck.
A doctor may spend less time skimming through many mammograms or many x-rays, and maybe focus on the few that the computer is not so sure about. Also, of course, doctors have a lot of other work. Even radiologists have a lot of other work besides just determining whether a tumor might be benign or malignant. So a lot of this is going to shift work away from what was previously a recognition problem (humans are good at recognition problems, but now computers are also good at them) and toward work that is artistic, communicative, or creative, or that requires synthesis or balancing different factors that influence different kinds of stakeholders, like a patient, or other cars on the road. These are things that computers still can’t do, and most researchers don’t think that computers are going to be doing these higher level tasks in the next 10 to 20 years.
So this means that some jobs will be changed, and maybe a small number of jobs will be destroyed: not small like one or two, but a small percentage on the national scale. A small percentage will also be created out of thin air. Maybe there will be new people who work specifically on optimizing, tailoring, and understanding these algorithms, because some of these algorithms, unfortunately, are black box algorithms. It’s great that the computer can make a decision, but oftentimes the computer cannot tell you why it made that decision.
Another common question we have is, ‘What should students do and how should they learn? Given that these are kind of the big trends, should I be focusing as a student on learning one specific neural network, or should I be learning Python, or should I be learning core computer science?’ I’ll speak first to KTBYTE’s general experience, and then Pratik, maybe you can talk about it afterwards. In general, we find that the most important thing is for students to have a really good core computer science experience, and that’s a prerequisite to being effective at machine learning. If you learn one specific algorithm or one specific tool, that might give you a definite edge in the short term, maybe for one or two years. But over time these tools will get more and more common, and like any tool, they change over time. Computer science itself hasn’t really changed for dozens of years, while operating systems and graphics and image editors have changed. Similarly, a lot of the statistical and data structure technologies are going to stay very similar, but the particular variant of a neural network, or the particular library you use to implement your neural network, might change from year to year. This means that when students are learning, they should be focusing on CS. They should also be focusing on research skills. A lot of the problems that students have when they’re doing this kind of work come from the fact that it’s very different from, like Pratik mentioned earlier, the USA Computing Olympiad, or school AP Computer Science. Oftentimes when you’re doing research, it’s actually hard to even know what questions you should be asking before you can start finding the answers. And it’s really challenging to budget your time, because if you have five or six questions that you might be able to ask, and you can only answer three of them, then which ones should you focus on trying to answer, right?
Pratik: Okay, as Ben mentioned, core CS is very important, because for neural networks and deep learning, the ideas and the algorithms pretty much change every single week. You need to read the newest papers to keep up with that. The core part of machine learning and deep learning is knowing how to work with data and knowing how to change the data in order to meet the requirements that you have. And the majority of that is built around core CS, like data structures and algorithms. So it’s important to know core CS before you delve into machine learning.
Part 6. STUDENT EXPERIENCE OF TAKING KTBYTE AI CLASSES
Benjamin: What was the experience managing your time and what was surprising when you were doing research? Because I think a lot of people when they start research are like ‘I’m going to build this model, and then I’m going to like discover this really amazing thing.’ Is it like that?
In KTBYTE AI classes, students are going to learn time management, how to do things in parallel, how to communicate their findings, and how to use resources and the experience of other people, other students, and advisors to plan their time. They also learn to work around deadlines and institutions if they’re submitting these projects or working towards a research project. We had one student who tried to distinguish between Harry Potter books and Harry Potter fanfiction. This student actually had four or five ideas originally, and this was just one of the ideas on which an early version of her program turned out to get good results. So she was able to dive down that path. But if you just saw the paper, you wouldn’t see 90% of the work that went into arriving at that cool project. And Pratik, I think you can speak more to this than I can, because you’ve been doing a lot of that recently.
Pratik: One really important thing in research is that the first thought you have will never really be the project that you have at the end. Many times, the project that comes out at the end will be wildly different from, or even the opposite of, the thought you had at the very beginning. It’s a very fluid process. It goes through many changes, and depending on what problems you have or what roadblocks you hit, you’ll need to make changes and adjustments in your research question as well as in the methods that you’re pursuing in order to get a feasible project at the end. Many times, it might not even be the project that you wanted, but it’s still important to go through the process and keep building, because it is a long process. It’s not going to take two weeks to come up with a novel algorithm that suddenly cures cancer. You just need to be prepared to go through it, and be prepared for all the roadblocks and challenges that you might face.
Benjamin: Yeah. And I think you did a couple of science fair projects. One was using a neural network, and one was not even using a neural network; it was using a more traditional algorithm. Typically when people talk about machine learning or deep learning, they’re talking about two classes of problems. One is called unsupervised learning, which tries to categorize things that are different from each other. The other is called supervised learning, which means I’m given some training data: for example, here’s a picture of a cat and you tell the computer, ‘this is a cat,’ and here’s a picture of a dog and you tell the computer, ‘this is a dog.’ Next time you give it an unknown picture and you ask the computer, ‘is this a cat or is this a dog?’ Pratik, can you use that analogy to describe how this might apply to, for example, cancer research?
Pratik: Some of the projects that I’m doing mainly have a medical focus, like telling whether a tumor, or an image of a tumor, is benign or malignant, that is, whether it’s harmful, or predicting features of cancer images. This can be for breast cancer, lung cancer, or anything where there are x-rays or images involved.
Pratik: Yeah. So going along with that analogy, let’s say you have a data set consisting of mammograms, or x-rays of breast cancers. Part of the data can be benign tumors, or tumors that are harmless, and part can be malignant tumors. What you do in this case is feed the neural network or the machine learning algorithm the dataset with both the benign and the malignant images. Then you ask it, ‘If I give you an unknown image, a new image of a person who potentially has breast cancer, could you tell me if it’s a benign or a malignant tumor?’ And this is for totally new data that it hasn’t seen before. So that’s basically the connection.
Benjamin: Did you take 82 with us Pratik?
Pratik: Yeah. I took 82 and 84 and 85. 82 really gave me a mathematical understanding of what goes on behind machine learning. I think that’s important because before you delve into these pre-coded functions, you need to actually learn what they do; and sometimes in CS82, you needed to know some amount of Calculus to understand what was going on. That was kind of hard to grasp at first. But once I saw the application in 84, I saw where the calculus was coming into the picture, and I, of course, had to do some research on my own. 82 mainly gives you a good mathematical grasp of what’s going on in machine learning before you go to 84, where we do purely example-based learning.
Benjamin: I do want to add a caveat that we don’t require students to know Calculus for any of these classes. And then what was your experience in 84 and 85 like?
Pratik: 84 was straight diving into deep learning and neural networks and seeing how we can implement them, and using them to do regression tasks, playing games, classifying objects. In 82 we did a lot of machine learning algorithms, like linear regression, Bayes and things like that. But deep learning is mainly focused on neural networks and working with Python to create neural networks. And at the end of 84, you do a small research project, but it’s not the intensity that it is in CS85 where you solely focus on your own research project the entire time. My project in 84 was something to do with poker hands. It was predicting what kind of cards you need for a poker hand given the cards in the deck or something like that. I spent two or three weeks on it because it was at the end. That was the project in 84. But the project in 85 was a lot deeper, and delving into the depths of machine learning and seeing what I could do to find a novel application.
Benjamin: That was your mammogram project?
Pratik: Yeah, it was basically using image augmentation to efficiently classify mammogram images, breast cancer images, using deep learning. So that was a very deep, in-depth project that took the entirety of the semester; maybe multiple semesters as well.
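For readers unfamiliar with the term, image augmentation means creating extra training images by transforming the ones you already have. The transcript does not say which transformations were used in this project, so the flips and rotation below are only common examples, and the helper names are hypothetical. Images are represented as plain nested lists of pixel values to keep the sketch self-contained.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus three transformed copies."""
    return [image, flip_horizontal(image), flip_vertical(image), rotate_90(image)]


if __name__ == "__main__":
    # Made-up 2x2 "image" used only to show the transformations.
    tiny = [[1, 2], [3, 4]]
    for variant in augment(tiny):
        print(variant)
```

Each variant keeps the same label as the original image, so one labeled scan can become several training examples.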
Benjamin: Before we leave this slide, I do also want to talk about why the AI/machine learning classes are a research track. A lot of the other classes, like our core classes, are projects, or games, or apps, so they feel like an engineering track. And a lot of the problems are algorithmic problem solving and very much like a contest. The styles are very different. When you’re doing a software engineering project, if it’s a small project then it’s very self-contained. But if it’s a large project, you’re learning a lot of software engineering tools. For example, you learn how to deal with a big code base or use version control to share code with other people. Then if you’re doing the Computing Olympiad, which is a different thing as well, you’re focusing on those algorithms, how to solve problems in the quickest amount of time, and how to prepare your code so that you’re ready to dive in during the contest. Here, it’s a completely different field again, because these machine learning classes are long projects, so long that you may be working on them for more than a year, with the project changing from week to week. So when students ask us, ‘Should we take these machine learning classes,’ we oftentimes ask them what kind of personality they have. If they like to see a result immediately, and have it work, then usually these are not very good classes for them, because they’re going to be disappointed many times.
Pratik: I was hoping to add that I was also one of these kids initially. I like to see a result immediately. That’s why I did USACO. Two years ago I was doing USACO very, very seriously, because the way USACO works is that it grades your code in an instant and tells you whether you’re right or wrong. It’s kind of an instant gratification thing. But through this process I realized that research is never instantaneous, where you do something in one second and it gives you the right answer. It’s a very, very fluid process that you need to go through. Sometimes you just need to go through the motions, and it takes a very long time to get a result, if you get one at all.
Benjamin: It is a research process. That said, the achievements are pretty cool. We have a lot of students who take the academic papers that they produced in 84 and 85 and submit them to a science fair. Pratik, can you speak a little bit about that experience as well? What’s the difference between preparing for a science fair versus actually working on machine learning and working on your project? There was the paper and the presentation part. What was that experience like?
Pratik: Yeah, so, preparing for the science fair actually took a little bit longer than doing the project itself. The entire timeline was from last summer, starting around August when I was thinking about what I was going to do for my project, all the way up to March for the regionals. From about August to October or November I was actually doing my project, and then from December all the way to March I was cleaning stuff up, writing my paper, making my poster, preparing a presentation, and thinking. For these science fairs, it’s not really about presenting your work. It’s about selling your work in a way that will appeal to a population that doesn’t really understand much about deep learning. Because deep learning is a relatively new field even in AI and machine learning, many people won’t really know the specifics or what’s going on in deep learning. You really have to make it so that they can understand. You have to take a very complicated topic such as your project, and make it sellable in a way people can understand and appreciate the work you’ve done over a period of, let’s say four or five months. So my first science fair project wasn’t really a deep neural network, or machine learning, or a deep machine learning algorithm. It was concerning a ‘k-nearest neighbors algorithm,’ which I coded in Java. It was to do with predicting a point of muscle fatigue using the ‘k-nearest neighbors algorithm.’ That actually got me to the regional and state science fair last year, as well as the national science fair this year.
This year was actually my project regarding mammograms and breast cancer, and I haven’t applied it to the national science fair yet because the deadline for that already passed; but I’ll do that next year. This one actually took me to the regional and the state level science fair.
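As a rough illustration of the k-nearest neighbors idea mentioned above: to classify a new point, find the k training points closest to it and take a majority vote of their labels. The sketch below is written in Python for brevity (Pratik's version was in Java) and is not his code; the function name `knn_predict`, the two-number "sensor readings," and the labels are all made up for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical readings: each point is a pair of invented sensor values.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["fresh", "fresh", "fresh", "fatigued", "fatigued", "fatigued"]

near_fresh = knn_predict(points, labels, (1.1, 1.0))
near_fatigued = knn_predict(points, labels, (2.9, 3.1))
```

With this toy data, a query near the first cluster is labelled "fresh" and one near the second is labelled "fatigued." The choice of k and of the distance measure are the main knobs to tune on real data.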
Benjamin: One thing I forgot to mention is that when you’re doing a lot of this work, besides the model part and the presentation part, there were also a lot of systems you had to learn, right? A lot of tool learning. Can you talk briefly about that?
Pratik: Yeah. I had to learn an entirely new programming language because I’d mainly worked with Java before, so switching to Python was kind of a jump. But mainly, the functions are the same, and the packages, whether you’re using them in Java or Python, are totally new and something you have to understand at the math level first. So with the Keras packages, once I got over the learning curve it was pretty easy to implement them. I also had to learn how to use the virtual machine that we have here at KTBYTE, how to log into the virtual machine, as well as R to parse through my data, augment my data, clean it up, do the basic janitor work. Linux as well. There’s kind of a learning curve associated with the systems, data processing, and programming languages, but once you get over that, it’s fairly easy to get into the machine learning part or the deep learning part of it.
Benjamin: We had a couple of questions, one from Howard: ‘How long will it take to finish 82, 84 and 85?’ Most students do 82 in one semester and 84 in one semester. Some students will spend two or three semesters on 85, because 85 is our independent study, with some tutoring and some systems support. So 85 is unique every single time you take it; there is no curriculum, while in 82 and 84, there is a curriculum.
Pratik: How long you take 85 depends on how much you want to get out of it. If you want to do science fairs for two or three years, then you would take 85 for a lot longer than one semester; whereas for 82 and 84 there’s a curriculum, so once you feel like you’ve understood the material, then you transition to 85. I started 82 in the middle of last year, or the year before, and 82 and 84, I finished in about half a year, maybe three quarters of a year. Then 85 I’ve been doing since the beginning of last year.
Benjamin: Here’s a couple slides of summary of why you should learn machine learning, or why you might not want to learn machine learning. If you like research, you can definitely do it. And if you do want to learn AI, don’t just focus on the most recent models, or whatever the most recent fad has been in the last year or two. That means all the things that Pratik was talking about. You should also pace yourself a bit. We do recommend that students plan on doing the high level classes by their sophomore, or at most junior, year of high school, because you start getting pretty busy by your junior year. This means that we like students to try entering the track in their eighth or ninth grade. If you want to do a science fair project, it’s good to start 82, or to plan to start 82, when you’re in ninth grade. That way you have ample time to do these kinds of long one-year projects while not having that compete with your SATs, other APs, and college applications, which can be pretty strenuous.
Pratik, can you talk a bit about what you’re looking forward to? I know you said you were doing an internship at the Dana Farber Institute, but now that you’ve learned these skills, can you talk about what you plan on doing with them? And why you like research so much? And is this something you’re gonna think about doing in college or afterwards?
Pratik: Yeah, the main reason I enjoy research is because there’s no deadline to finish research. I mean there will be deadlines based on when you enter a science fair or a competition. But there’s no real pressing deadline to finish research by, let’s say next week or the week before like you have with school assignments. You can spend as much time as you want on it, and the finished product is entirely based on how much work you put in as well as how much you want to gain out of the project itself. So in college, I definitely want to do something with deep learning and AI, but it’s been said you have to go through a core computer science track first before you want to specialize into different areas of AI and machine learning. So with this internship, I think it really helps me become an independent researcher, and it also helps me get the skills that I need for college and a university setting. That will help me specialize in this track as I go forward.
Benjamin: We had one parent ask if you could mention which specific science fairs you applied to?
Pratik: It depends mainly on the region that you’re in. Since I’m in Massachusetts, I entered the Mass State science fair. Based on your region though, there will be a bunch of science fairs that you can attend. Another science fair I did was the Junior Science and Humanities Symposium. It’s a national science fair, but it’s based at BU, Boston University. It’s a military funded science fair, and the Department of Defense actually funds it. Nationals this year was in Washington DC, but it changes every year. It’s entirely based on where the Department of Defense wants to hold it. Next year it’s in Las Vegas, so where you go depends on the year and your region. Those are the two science fairs I’ve entered. There are also a bunch of research programs that you can apply to, like the Davidson Research Institute program. There’s also Broadcom, and Stony Brook; that’s also a research program.
Benjamin: I want to speak briefly about the state fair as well. Some public or private schools might have a science fair. Then the school endorses several of the participants of the school fair to go on to the regional fairs. Then the winners of the regional fairs go on to the state fair. That is a common track. But then there are some people who go to schools that don’t have a science fair. We have several students who had that experience. You can apply either directly to the regional fair, or directly to the state fair. A lot of these have different deadlines, which is the case for Massachusetts. It can be somewhat complicated because some of these contests require you to have one form for your abstract, and another form if you have any human subjects, and another form to get approval for your overall plan. So you do have to look at which contest you’re applying to and discover what the deadlines are specific to that contest. That is something that we assist students with in the CS85 class. To be perfectly honest, it’s mainly just looking at the deadlines, contacting the different institutions, and getting the requisite paperwork. But if you don’t take the class, you could also do it yourself. Call these institutions and make sure that you file things on time. We try hard in the classes to make sure that the students file it so far in advance that if they get rejected for some reason or another, they have a chance to fix it and then resubmit. That’s something I would recommend everyone should try to do. The downside of these contests is that most of them are only yearly. It’s not like the USA Computer Olympiad where you can do it four times a year, so if you scrub one month you can try the next month. With a lot of these, if you don’t do the paperwork right, you miss the entire year. But they are cool contests because you get to go there, see a lot of other people who are doing really cool other projects, and make some friends.
It’s usually an event as well where you can just hang out.
Pratik: Another upside is, once you come up with a research project that you think is significant or beneficial, you can apply to more than one science fair. You’re not limited to just applying to one science fair; you can apply to multiple science fairs, and that really increases your chances, if anything.
Part 7. Q&A ABOUT KTBYTE CLASSES
Benjamin: Alright, that pretty much concludes our talk and it’s 8:00. I can fill in the time talking about the core curriculum while people are asking questions. The core curriculum, which is these fundamentals classes and then the CS classes, teaches a lot of the same concepts as if you went to a university to study computer science or took AP computer science: concepts like variables, if statements, loops, arrays, functions, iteration, recursion, object oriented programming, and polymorphism, more or less in that order. The classes in the core curriculum have the students build projects.
Usually they do one kind of group demo every single week and then they do some kind of independent project towards the end of the semester. Some of the classes, like these B classes, tend to have larger independent projects. Especially with these orange and red classes (CS01 & CS02 series) they also come with fairly frequent problem sets, which are homework assignments they have to do every week. The lower level fundamentals classes have less required homework and some optional homework. The basic idea is that it prepares students over the years to do the practice they need to really solidify these skills. We kind of compare it to playing an instrument. You want to be good at scales and you want to have good technical skills, and a wide breadth of applications to be familiar with. Because if you don’t have the technical capacity, then it’ll be a real struggle oftentimes when you try to be creative.
That’s definitely the case with computer science. When students are learning this stuff, it means that when they’re doing a high level project, they’re not worrying about ‘How do I get data from this format to another?’ or ‘How do I write this search algorithm, or the sorting algorithm, or whatever?’ and they can focus on the more advanced things. But we do it in a way that they get to make video games or audio programs, video programs or different kinds of applications related to science. So it’s not just homework assignments throughout the entire thing.
So that’s the core curriculum that Reuben was asking about. And the core curriculum is one semester for each of these classes. Usually students take them for 18 weeks, that’s usually September through January or January through June. Sometimes students take them over the summer. Then it’s like a one month compressed curriculum. Then these light color ones can sometimes be skipped while the darker ones, we generally require students to go through them unless they have some outside experience that lets them jump into a later level. So that’s the main core curriculum. You can go onto the KTBYTE website, look at those classes. They’re taught both in person and online, so you don’t have to be based in Massachusetts to be taking any of these classes. And they all have classes starting in September right now.
Here is a question: ‘If you already won silver for USACO, can you go to CS92?’ We do allow students who have already placed in gold to attend CS92. Meaning that they’ve already placed into gold, not placed into silver. If they’ve only placed into silver, we usually require them to take CS91. This is not related to AI, but the USACO has changed significantly pretty much every year in the past three years, starting with the creation of the new bronze level, the creation of the new platinum level, and then each year they’ve been tweaking the difficulties. Previously 90 and 91 were very similar, just slightly more difficult versions of each other; but nowadays 90 and 91 have pretty significantly different algorithms, and 91 and 92 have completely different algorithms in them. So we usually say that if you’re supposed to be taking the 91 level and you take the 92 level, that might actually hurt you because you’ll be prepping for algorithms that don’t even show up in the other ones.
We have a parent asking ‘What’s the difference between FUN3w, FUN3a, and FUN3b?’ W is weeklong, so it’s short; there’s usually no homework. A and B do have homework. A is for new students; these are done in Processing. B is usually for continuing students who still aren’t ready for the problem set intensity of CS00a, because once you hit CS00a, it’s kind of like the pace at the high school level; you get a full problem set, usually 5, sometimes 10 problems every week.
We usually say that if students are younger than high school age, say 12 or 13 years old, we try to have them go a little bit slower; especially if they’re 10 or 9 years old. Students benefit from not rushing too quickly and doing more projects and trying to get really good marks in all of the different assignments before they go into CS00a. But FUNb is an optional track based off of pacing. W has no homework, so all the Ws are shifted left: even though they might cover the same material, they’re just not as hard, because it’s only one week and there’s no homework, so it’s not easy to reinforce everything.