The Emergence of ChatGPT and its Implications for IT and HPC
In this webinar, we will introduce ChatGPT, a cutting-edge natural language processing model developed by OpenAI. ChatGPT is a variant of the popular GPT-3 model, designed specifically for conversational applications. With its powerful language generation capabilities, ChatGPT can produce human-like responses to text input in real-time, making it a valuable tool for a variety of applications such as chatbots, dialogue systems, and more. Join us to learn more about ChatGPT and our thoughts on its implications for high performance and enterprise computing.
Webinar Synopsis:
- Panel Introduction
- What is ChatGPT
- How Does HPC Impact ChatGPT
- Writing Code With ChatGPT
- ChatGPT As A Tool
- ChatGPT As A Tool To Spark An Idea
- ChatGPT Results In Producing Container Files
- ChatGPT Doing Odd Stuff
- ChatGPT Mimicking Current Open Source
- How Does ChatGPT Improve Traditional Chatbot Technology
- What Is The Potential of ChatGPT In The Job Market
- ChatGPT As A Tool To Progress As A Developer
- ChatGPT As A Tool On-Prem Appliance
- What Does Generative Pre-trained Transformer Mean
- What Are The Parameters Of The Model
- Hardware That Has Been Used To Train AI
- How Accurate Is ChatGPT When Writing Code Or Code Definition Files
- ChatGPT Live Demo
Speakers:
- Zane Hamilton, VP of Solutions Engineering, CIQ
- Justin Burdine, Solutions Engineering Team, CIQ
- Dave Godlove, Solutions Architect, CIQ
- Jonathon Anderson, Solutions Architect, CIQ
- Forrest Burt, HPC Systems Engineer, CIQ
Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.
Full Webinar Transcript:
Zane Hamilton:
Good morning, good afternoon, and good evening, wherever you are. Thank you for joining us for another CIQ webinar. My name is Zane Hamilton. I am the Vice President of Solutions Engineering here at CIQ. We are focused on empowering the next generation of software infrastructure, leveraging the capabilities of cloud, hyperscale, and HPC. From research to the enterprise, our customers rely on us for the ultimate Rocky Linux, Warewulf, and Apptainer support and escalation. We provide deep development capabilities and solutions, all delivered with the collaborative spirit of open source. Today, we are going to be talking about ChatGPT. I think this is something that has been all over the internet all week, and I am really excited to have this conversation. We have some very passionate individuals on this topic, so if we could bring everybody in, that would be great. Let's go around and do introductions. Justin, I will start with you.
Panel Introduction [6:11]
Justin Burdine:
My name is Justin Burdine. I work with Zane on our solutions engineering team. I am really excited to get here and talk with you guys who are much smarter about this than I am, especially from an HPC perspective. The role I am playing today is the layman's understanding of what this thing does and the potential that it can do. I am definitely excited to hear their perspective.
Zane Hamilton:
That is great. Mr. Godlove going backwards today.
Dave Godlove:
My name is Dave Godlove. I have a background as a neuroscientist and a primary research scientist at the National Institutes of Health. I have been around the high performance computing and container ecosystem community for quite some time. I helped to develop Singularity and also Apptainer. I am a solutions architect working at CIQ.
Zane Hamilton:
Thank you, Dave. Jonathan, good to see you.
Jonathan Anderson:
My name is Jonathan Anderson. My background is in HPC CIS admin and related things, and I am with the solutions architect team here at CIQ.
Zane Hamilton:
Thank you. And Forrest?
Justin Burdine:
He has the second window open. He is going to do some sharing from there, so I bet that is what is causing some feedback.
Zane Hamilton:
We will work through it.
Forrest Burt:
My name is Forrest Burt. I am an HPC systems engineer here at CIQ. I work on the solutions architect team with Jonathan and Dave. I am very excited to discuss ChatGPT and its broader implications for HPC.
Zane Hamilton:
I know this is something that Forrest, I think you and all of us had talked about quite a bit recently. I know you are very passionate about this. I did not know anything about it until you started talking about it, and then I went and started playing with it, and it is mind blowing. Why don't you explain to everyone what ChatGPT is?
What Is ChatGPT [8:46]
Forrest Burt:
ChatGPT is a large language model for text generation that has been developed by OpenAI. You have probably heard of OpenAI in the media before; it is the same company that put out the DALL-E and DALL-E 2 text-to-image generation models that over the past six to eight months or so have kicked off a lot of the massive public interest in AI. For a really long time, algorithms, content, ad serving, tracking, all that stuff has been done to an extent algorithmically via AI and those types of things. There have been a lot of these little brains deciding things for a long time behind the scenes that many people may not have been aware of.
Even fraud detection systems and things like that for transactions: there is a lot of backend AI out there that people are not aware exists. Over the last six to eight months, we have seen a revolution where some very forward, public-facing, very obvious uses of AI have become very, very popular. One of the first ones was the text-to-image generation stuff that came out with DALL-E 2. Like I said, that is OpenAI. They drew me a portrait of a cat in the style of Vincent van Gogh, that type of thing, which was huge. That came out as a beta research preview, which was really big. Then it became publicly available and everyone got a hold of it. Then there was an open source text-to-image generation model called Stable Diffusion, which really shook things up, because at that point, with it being open source, people could build it into all kinds of other things.
That has made things even crazier here in the last three or four months. Now we have seen another side of this beyond the text-to-image generation models, one that is perhaps even more powerful and much broader in scope, in terms of the capability you have with these newer text generation models as opposed to just the text-to-image ones. Back to your original question, Zane: what is ChatGPT? Like I said, it is a large language model developed by OpenAI that has been trained on a huge amount of textual content. As I said, text-to-image generation has been around six to eight months. This is the first time in this AI revolution that we have seen something that generates text like this come out.
It is highly advanced. It turns out that text ends up encompassing a lot of things. Essentially anything that has some type of text-based data encoding can hypothetically be expanded upon by ChatGPT. I saw someone somewhere describe this as basically anything with a grammar, essentially an organized, textual way of structuring information, it can do at least some level of generation for. It is obviously not conscious, not intelligent, anything like that. It is at best a massive facsimile of the writing component of a consciousness that can, with very high accuracy, reproduce what you expect it to say when you ask it something. It is not always a hundred percent accurate, but like I said, text-based encoding goes really, really far.
I have seen people generate simple melodies in music that are compositionally, musically correct with this. I have seen people generate code with it. I have seen people generate all kinds of different things with it. Ultimately, depending upon how it is prompted, you can sometimes get it to produce more obviously incorrect output than at other times. In some cases, to an expert in the field, it can get to 85% error with some simple things. It is not perfect. I heard someone call it a Dunning-Kruger simulator, because it is very capable of convincing you that it knows something about a lot of different things, when in fact, if you are not an expert in it, you might not be able to tell the minutiae that it is missing.
Ultimately, in the end, it is this massive language model that is really, really good at saying what you expect it to say. While that tends to be accurate, because the information it has been trained on is by and large accurate, it does not always put things together in the correct ways. Regardless, it is incredible technology, and what it can already do is pretty mind blowing. We have been thinking about the obvious question: six to 12 months out, what is this going to do? That was a very long-winded way to say it is a large language model that is very capable of producing incredibly interesting and intricate output, which is sometimes incorrect, but is correct enough of the time for it to be a useful tool to people very broadly in a lot of different fields.
Zane Hamilton:
That is great. Justin and I like to keep lists of things that we predict are going to happen. I have a list going now of things from this that I predict. I would love to hear from you all who are watching: what do you predict is going to come out of this? Then at the end, I am going to ask everyone else on this panel to give me their predictions as well. I will write them down and we will go back in a year and talk about it. Jonathan and Dr. Godlove, from your points of view, how does this impact HPC, and how does HPC impact something like ChatGPT? I will start with you, Dave.
How Does HPC Impact ChatGPT [14:21]
Dave Godlove:
Great question. I will begin with how HPC impacts ChatGPT. HPC, very broadly speaking, is integral to this type of technology. Many folks who are using AI might not actually classify this as high performance computing as such, but when you are taking giant pools of computing resources and linking them together, including GPUs or TPUs or whatever it might be, and you are pulling those together and using them to train models, that in my mind is what HPC is all about. I think a lot of people on this panel have talked before about how AI really is created by HPC.
As far as training the models to do this, and actually using the trained models to generate this language and generate the pictures and the things that Forrest was talking about earlier, it is pretty clear how HPC is integral to that. I think another interesting question that we can start to ask now is: how is AI going to impact HPC? That question is a lot less well defined at present and a lot more interesting. Forrest touched on one of these things a few days ago, and I would like to talk about it in a little bit more detail.
Writing Code With ChatGPT [16:07]
Maybe, I do not know if we should go on this tangent right now, but I would like to at least poke at it a little bit. Forrest talked about how text is a really broad category of data. There is a lot of different data which can be encoded as text, including code. One of the ways I have been playing around with ChatGPT a little bit is when I need a little function for something I am writing. Coders usually know that it is a lot easier to take some existing code and modify it to do what you need, rather than to start from scratch and write everything up. I have been playing a little bit with just asking ChatGPT: hey, in such and such language, create for me a function which does X, and then turning it loose, and it spits something out.
It might not be exactly what I am looking for, but I can copy and paste that into an editor and I can start working on it. Pretty soon I have something running. I asked it the other day, just on a lark, to create a Slurm submission script for an MPI job running Charm++. It was pretty cool because it made some assumptions about specific partitions which existed on the cluster and so forth, and resources that it had available to it. With those assumptions, it created a semantically acceptable submission script for Slurm. You could grab that and say, well, I do not have this partition on my cluster, but I have another one which I can use.
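A generated submission script of the kind Dave describes typically looks something like the sketch below. The partition name, module name, and resource counts are exactly the kind of assumptions the model invents on its own, so they would need to be adjusted for a real cluster:

```bash
#!/bin/bash
#SBATCH --job-name=charm-mpi         # job name
#SBATCH --partition=compute          # assumed partition; replace with a real one
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=32
#SBATCH --time=01:00:00

# Load an MPI environment (module names vary by site)
module load openmpi

# Launch the Charm++ binary under MPI; the binary name is hypothetical
srun ./my_charm_app +p "$SLURM_NTASKS"
```

The #SBATCH directives are comments to bash but are parsed by Slurm's sbatch command, which is why a script like this is syntactically valid even when its assumed partition does not exist.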
You could have changed some of the values, and as a new person using a cluster, you could very quickly get up and running. Then my personal favorite so far: I have a little pet peeve, or not really a pet peeve, but I guess a coding weakness, that I never really sat down and did the work to learn. I mean, I know about them, but I have never really learned the semantics of regular expressions. They irk me because they are slightly different in every language, so regular expressions are something that bother me. The other day, when I was dealing with one, I just plopped it into ChatGPT, and it explained for me, piece by piece, every little bit of the regular expression.
Then I said, oh, okay, well that is not exactly the one I need; generate for me instead one that does this. It said okay, and it spat one out for me. It was a passable regular expression. As Forrest said, it is not always exactly right, but when you are dealing with something like a function that you can test, this does or does not load the file in the way in which I have asked you to do it, it is perfectly acceptable at that point to say, hey, generate for me this thing that is supposed to do X, Y, and Z. As long as you go test it and make sure it actually does X, Y, and Z after you get the function written, that is a great way in which this technology is going to accelerate, I think, high performance computing and software development in general.
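The test-what-it-gives-you workflow Dave describes is cheap to do in practice. A minimal sketch in Python, where the pattern (matching a simple ISO-style date) stands in for whatever expression the model actually generated:

```python
import re

# Hypothetical model-generated pattern: match an ISO-style date (YYYY-MM-DD).
pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Before trusting generated output, exercise it against known good and bad cases.
assert pattern.match("2022-12-09")        # well-formed date: should match
assert not pattern.match("12/09/2022")    # wrong separators: should not match
assert not pattern.match("2022-12-9")     # missing zero-padding: should not match
```

If the generated pattern fails any of these checks, you paste the failure back into the conversation and ask for a revision, which is exactly the iterate-and-verify loop described above.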
Zane Hamilton:
That is interesting. Dave, I am going to come back to some of that in a minute, but I want to give Jonathan an opportunity to weigh in on his thoughts on this. How is it going to impact HPC, and vice versa?
ChatGPT As A Tool [19:32]
Jonathan Anderson:
Well, I wrote down some thoughts here. I have been trying to figure out how to classify the usefulness of this as a tool, which was my first big perspective shift. Every time I have heard about things like this, they have seemed like a novelty to me. This was the first time it felt capable enough that I legitimately found myself reaching for it as a tool. I think it is pretty cool. The best way I have found to describe it so far is that it is a really good rubber-duck-debugging, pair-programming thing. If you are having a problem, you could explain it to this ball on my desk and there would be a certain level of value in that. I think working through a problem with a model like ChatGPT will probably be a better experience than that.
It is that old adage that just talking through a problem helps you solve it. I am really interested to see what happens in the space of custom training. All of the notoriety in the news has been off this generic, general-knowledge training set. One of the things, as I understand it, that OpenAI in particular is offering is to retrain their model based on your domain-specific information. I am really interested to see what that experience is like, and how much information from your specific domain you need to provide for it to produce interesting results. If I think about it in the context of another kind of machine learning, what we have traditionally seen is something like image categorization, where you can have a model go through a bunch of images and categorize them by what is in the images.
I am interested to see what happens if you can feed a bunch of domain-specific information that is uncategorized, unsorted, not really well understood into a model like this, and then have a conversation with it to learn things about the data that is in there, and start a research process through conversational discovery, as though you are having a conversation with someone who has gone and read all the books on the topic. I think that would be cool. There has also been some news in the past specifically around code generation with, is it GitHub? What is that thing called? GitHub has a code-generation bot, Copilot. Probably just because I do not have a habit of using a code IDE, I never really got into that. What I actually use more than the ChatGPT interface is the OpenAI playground interface.
That is more obviously a text prediction thing. Instead of having a conversation, you can start some text and then hit enter, and it will try to continue what you have done. You can do code generation with that really simply by writing a comment. It is like: this function does this, and then enter, and it spits a function out at you. I have found myself doing that kind of thing. I do not remember if I said this yet, but one of the guys on the team referenced it as like having a junior developer at your beck and call at all times. But there are negatives to that. I do not know Golang very well. It is new to me here at CIQ, and I have been coming up to speed with it through interfacing with our own products and the communities which we support.
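The comment-as-prompt pattern Jonathan describes looks something like the following. The completion shown is only illustrative of the kind of function the model tends to emit in response to such a comment; actual output varies from run to run:

```python
# Prompt typed into the playground, as a bare comment:
# this function returns the n-th Fibonacci number

# A completion of the kind the model typically produces (illustrative,
# not actual model output):
def fibonacci(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fibonacci(10))  # → 55
```

As with the regular expression example, the completion still needs to be read and tested; the comment only steers the prediction, it does not guarantee correctness.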
ChatGPT Used As A Pointer To Think Outside The Box [23:08]
I wanted to do something in a Warewulf template with a Golang template. With the information that I have about how Golang works, I thought about how that should look, and I tried it the way I thought would make sense, and it did not work. It spat out an error, and I spent a while googling, and I thought, you know what? I will ask ChatGPT how to do this. I looked up the syntax for a Golang template comment, and I wrote a comment at the top, like: this Golang template does this, and hit enter. It basically spat out exactly the same code that I had written myself, which does not work. I thought it was really interesting that we both had the same process of understanding how the language works and what would make sense, and then tried it, but it did not get me new information. That is where I think what Dave said applies: it is really best used by an expert who needs pointers to things, rather than trusting the answers it generates. I think that level of brainstorming and getting you to think outside of the box is what it will be really good at, at least in the near term.
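For readers unfamiliar with the syntax in question, a minimal Go text/template, including the {{/* ... */}} comment form Jonathan mentions, is sketched below. The template text and field names are hypothetical illustrations, not taken from an actual Warewulf template:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderNode renders a minimal Go text/template. The {{/* ... */}} form is
// the template comment syntax; it produces no output when executed.
// The Name/Image fields are hypothetical, not from a real Warewulf template.
func renderNode(name, image string) string {
	tmpl := template.Must(template.New("node").Parse(
		"{{/* this template renders a node's name and image */}}" +
			"node={{.Name}} image={{.Image}}"))
	var sb strings.Builder
	tmpl.Execute(&sb, struct{ Name, Image string }{name, image})
	return sb.String()
}

func main() {
	fmt.Println(renderNode("n001", "rocky-9"))
}
```

Note that template.Must panics at startup on a syntax error, which is the kind of failure Jonathan hit when his first attempt, and ChatGPT's identical suggestion, did not parse.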
Zane Hamilton:
I think that is really interesting. I feel like we have all tried this from a code perspective. I know Justin and I have had this conversation about it getting you some percentage of the way there, and I think you phrased it a little differently than I had: using it as a place to get an idea. But before I ask you that question: Forrest, you have done this and tried it with some different scripting languages; Jonathan, you tried it with Go; Dave, you did it with Slurm; Justin, I know you did this and asked it to write an Ansible playbook. I asked it to write some C for Arduino, and it all turned out actually pretty good. It is interesting the broad range of code that it can do. But Justin, you and I have been talking about it and looking at it from a writing and a speaking perspective, of I am having trouble with an idea, and then I will let you tell the rest of that story.
ChatGPT As A Tool To Spark An Idea [25:07]
Justin Burdine:
I think the statement to start with is that it is really good at generating that spark of an idea. If you are sitting at a blank page with no idea what to write, just start talking to this thing, and it will come up with responses that you can then take. I think everybody has said that it gives you that point of generation. Certainly it is not something you want to actually put out there as a final product, but it does a pretty darn good job. I did see a funny quote this last week as people were reacting to this thing: somebody said it is the world's greatest BS artist, meaning if it does not know something, it will be really convincing in telling you that whatever it is telling you really is the right answer.
I think right now it is really a great tool, like I said, for getting that original idea. I think the anecdote you are wanting me to talk about is that I had spent about a full weekend with my Arduino projects trying to work out how to do some controlling of a monitor. I had to look up some really obscure things, and it took me all weekend to finally get it working. Well, Zane was doing the same thing, but with stepper motors on the Arduino, and he asked it, hey, can you write me code that will control these stepper motors and have a web server, and have the web server control it? And it wrote the code. I think it missed like three little header files, but other than that, it worked just fine. I think it is pretty fantastic. I do not think it is as doom and gloom as the headlines will say. If you are asking me from that perspective, I do think that it is certainly going to be the next thing. I am very curious to see where it ends up.
Zane Hamilton:
It is fantastic. We are getting a lot of questions and comments put in here; thank you for all of the interaction. Alright: the OpenAI playground is fantastic for ad hoc function generation. It can also generate convincingly meaningful text for a birthday card. Ah, interesting, had not thought about that. Mr. Devon is giving us some great comments here, with a little bit of "too many Daves." Love it.
ChatGPT Results In Producing Container Files [27:17]
Forrest Burt:
Some of the stuff that I have seen, like I said, is with scripting languages and things like that. Like all of us, I am trying to use this to automate components of what I do and to see what its capabilities are, as we touched on with writing Go and writing Slurm files. I have been looking at it from a container perspective and trying to figure out what exactly it can do around, as you noted there, Zane, scripting languages and things like that as well. One of the interesting results I have seen so far came from asking it to produce a container file for a container that has GPU enabled PyTorch in it on Nvidia GPUs, or something like that.
This was like the first thing that I ever asked it to write for me. The cutoff of the information that it knows is around the end of 2021, around September or so, so it does not know about certain things. It does not know, for example, about the Singularity and Apptainer switchover. One thing I asked it was to write a Singularity container file for Nvidia GPU enabled PyTorch. I had to lie down for a moment when it spat out a file that was like, wow, this is pretty close to how I would do this in the end. It is not precisely how I would install the CUDA toolkit, but it is a way which I am pretty sure is going to work.
I was able to take that, with a couple of modifications, and end up with it working and building, and I am like, wow, it is really incredible that it knows enough about container syntax to actually give a post section and Bootstrap and all that stuff. Some of the other stuff that I have given it shows how it can be kind of non-deterministic, and it is interesting how you can mine results out of it. I asked it to write me a diffusion based text-to-image AI generation model based on PyTorch. It first off said, oh, I am just a large language model built by OpenAI, I cannot do that.
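For context, a Singularity/Apptainer definition file for GPU enabled PyTorch of the sort Forrest describes might look roughly like the sketch below. The base image tag, package versions, and index URL are assumptions for illustration, not the model's actual output:

```
Bootstrap: docker
From: nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04

%post
    apt-get update && apt-get install -y python3 python3-pip
    pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113

%test
    # Should print True when the container is run with --nv on a GPU host
    python3 -c "import torch; print(torch.cuda.is_available())"
```

The Bootstrap/From header and sections like %post and %test are the pieces of container syntax Forrest is referring to; getting those structurally right is what made the model's output close to usable.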
ChatGPT Doing Odd Stuff [29:42]
I reset the thread and put the same request back in, and lo and behold: oh, here is your diffusion based text-to-image AI model written in PyTorch, and it gave me like 50 or 60 lines worth of code for it. I was like, wow, that is very, very impressive. It is interesting how it can be non-deterministic; sometimes I have seen people say, oh, well, I think a large language model should be able to do that, so give it a try. It is kind of interesting how the problem space that it understands is not absolute, and it can be non-deterministic in what it actually ends up producing. It can do really, really odd stuff, like simulate a Linux terminal, because ultimately a Linux terminal is just a text-based interface. You can encode all of that information in such a way that it can understand it.
ChatGPT Mimicking Current Open Source [30:29]
Somebody out there figured out a magic paragraph to give it that basically gives you a Linux terminal prompt. I was taking that and putting bash vulnerabilities into it. I found that you can give it a fork bomb, and it will basically say terminated, as if it has gotten rid of it before it could explode and halt the system. You can also give it, for example, one of the initial Shellshock CVEs, and it will actually print output saying the system is vulnerable. But if you give it one of the later ways to test, one of the other Shellshock CVEs, it will give you output that indicates a patched system. It has been interesting to me to see that when this initial command is posted on the internet, its results seem most often to be associated with what is printed for a vulnerable system, and that is what ChatGPT gives you. The other command seems mostly to be associated with a patched result; if you go look at the Wikipedia page for these, that is the output it shows, and from that other one you get an indication of a patched system. It is not absolute. You can definitely poke holes in how it knows things, and you can definitely find ways that it is not actually thinking about things, but is just mimicking what it has seen out there.
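For reference, the original Shellshock check (CVE-2014-6271) that Forrest is describing is shown below; the fork bomb appears only as a comment, because running it on a real machine would exhaust the process table:

```shell
# The classic bash fork bomb -- do NOT run this on a real machine:
#   :(){ :|:& };:
# ChatGPT's simulated terminal just prints something like "Terminated".

# Original Shellshock test: a vulnerable bash executes the code smuggled
# into the exported function definition and prints "vulnerable" first;
# a patched bash prints only the line from the inner echo.
env x='() { :;}; echo vulnerable' bash -c "echo this is a test"
```

On a patched bash this prints only "this is a test", which is the patched-system output Forrest contrasts with the "vulnerable" line the simulated terminal reproduces for the better-known variant of the command.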
You can tell that for all the intelligence it might appear to have with some of these things, there are still odd spots in the problem space where it is not really doing what it seems like it is doing. It is just a very, very convincing model. You can also get it to play Zork to an extent. One thing that is very different about ChatGPT versus these other models is that it has a concept of memory. There have been other chatbot AIs in the past, but I do not find it accurate to classify this as a chatbot, because there is a lot more that it can do than just chat; this is not 2011, and this is not Cleverbot or something like that.
How Does ChatGPT Improve Traditional Chatbot Technology [32:32]
I may have lost my train of thought in saying that, but I was addressing one of the questions that we had along the way: how does ChatGPT improve on traditional chatbot technology? Like I said, it actually has a concept of memory, so you can refer to prior things in the conversation and it will be able to continue to expand upon them. For example, I fed it a container file, 50 or 60 lines long, that I had written, and it gave me a whole analysis of what the container file is doing in each step: it is probably installing this, it is probably installing that. Then I asked whether this container file appears to be GPU enabled.
Without having to respecify the container file, it just knows that I am talking about it from a prior part of the conversation. One big unique thing about it is that it has this concept of memory. I may have had an original point that I was trying to link that back to, but one of the biggest differences is that it has a concept of memory, and for all the mimicking it is doing, it can still build on itself in the context of one conversation. It is very interesting. It is simultaneously extremely powerful, yet in the context of one conversation you can get it to show you a vulnerable system and a patched system for the same CVEs. It is interesting to find the odd holes in its capability. When it gives you the Zork output, there is actually a serial number on that output, and if you trace the whereabouts of that serial number, you can start to fingerprint the data sets it was trained on. But I digress; I will stop talking about the OpenAI research side there.
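The "memory" Forrest describes can be approximated even with a stateless completion API by replaying the conversation so far in every prompt. A sketch in Python; the send step is only a placeholder for a real API call (for example, OpenAI's completion endpoint), which is an assumption here, and the testable part is just the prompt assembly:

```python
def build_prompt(history, user_message):
    """Concatenate prior turns so the model can refer back to them.

    history is a list of (role, text) pairs for earlier turns; the
    returned string ends with 'Assistant:' so a completion model
    continues as the assistant.
    """
    lines = [f"{role}: {text}" for role, text in history]
    lines.append(f"User: {user_message}")
    lines.append("Assistant:")
    return "\n".join(lines)

# Earlier turns (hypothetical), mirroring Forrest's container-file example:
history = [
    ("User", "Here is my container file: ..."),
    ("Assistant", "This file installs PyTorch, then configures CUDA..."),
]

# The follow-up question works without restating the container file,
# because the prior turns ride along in the prompt:
prompt = build_prompt(history, "Does this container file appear to be GPU enabled?")
# A real system would now pass `prompt` to the model API (placeholder step).
```

Whether ChatGPT itself works this way internally is not public; this only illustrates why a follow-up question can resolve "this container file" without it being respecified.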
Zane Hamilton:
Several comments are coming in. I know Andre has a good one: he was wondering if the team could speak a little bit about the potential impact of ChatGPT on the job market. Forrest and Justin, I know we have talked about this quite a bit, and as was mentioned, it is currently close to a junior developer, and that is just in preview. I am going to go back to Dave, since you started this topic. What do you think the potential is?
What Is The Potential of ChatGPT In The Job Market [34:50]
Dave Godlove:
This is not a topic of discussion which is unique to this particular technology. There was a big piece that came out recently, I believe in The Atlantic, asking if ChatGPT was going to kill the college essay. Because now you are basically handing all your prospective college students a way to plagiarize their entire essays, and you are making it much, much more difficult for the software that detects plagiarism to actually figure out that you have done it. I can only answer this question in terms of other types of technology. Really, anytime anything new comes out, there is a fear that it is going to make one group of workers obsolete and put a bunch of people out of work.
What you find is that the new technology typically enables new types of work that were previously unimaginable. Then those new types of work create a bunch of jobs and open up a bunch of new things, and so on. It might be the case that this type of technology ultimately ends up displacing a lot of people, but I think it is just going to change society, right? In the ways that lots of different technologies change society, it is going to end up creating new jobs as well as moving people from one place to another. I do not know what that is going to look like yet, but my feeling is that that is probably what will happen.
Zane Hamilton:
As you are saying this, I am thinking about Andre's question. To get really good at something, you have to either do it a lot, or go to training, or go to training and then do it a lot, right? To become even a junior admin, there is a lot of effort in that. You have to do a lot of things; you have to try, fail, and repeat. How could this actually help make that progression faster, taking you from a junior admin or junior developer to a more senior developer? It can give you the examples that you need very quickly, so that you are actually learning the right way, or learning in a way that actually works, instead of having to go hack around on the internet for weeks, like Justin and I were doing, to find something simple. I can do it in minutes. So maybe, does it help?
ChatGPT As A Tool To Progress As A Developer [37:19]
Justin Burdine:
I think so. I mean, one of the cool things about it is that you can have it explain things; think of Dave's admission of struggling with regular expressions. Put something in there: explain this to me. Having that at your fingertips, and being able to get something that is written out, is powerful. I even saw someone the other day starting to prompt it to explain things as if to a five year old. They take this big complicated output and go, okay, great, that is awesome, now explain it as if I am five. When that thing can spit out the same information, but much more digested, so that people can learn at that level, it gives them, I think, a much faster ramp up, because you are not necessarily having to start from the ground up. I think there is certainly a deeper understanding that comes from doing the work yourself, but the advantage of having this is just new and different. There will be fears out there of all sorts of things, but it is definitely, certainly interesting.
Zane Hamilton:
That is great. I am trying to get to the comments. Perfect. Dave asks: do you think OpenAI will ever sell a dedicated ChatGPT appliance that comes with a massive training data set and continuous updates, for a federal customer who likes such things as on-prem appliances?
ChatGPT As A Tool On-Prem Appliance [38:37]
Forrest Burt:
I think this kind of goes back to what Jonathan was talking about with the offerings around retraining that OpenAI has. I think OpenAI is probably going to be pretty tight-fisted with their model code or anything like that which people could potentially pull out of an appliance and try to put elsewhere. Now, granted, that would be pretty bold, but the level of sophistication it already reaches with the training it has is more than likely going to be augmented, as Jonathan said, most effectively as relates to commercial business, with code bases and documentation sets, and who knows, maybe even copies of message boards or things like that. Those are the kinds of data sets I anticipate people will pay OpenAI to retrain on. You can already pay them to retrain some of the other models they have, like GPT-3, on custom data sets of your own.
It is capable then of generating custom responses based on what it has been retrained on. ChatGPT at the moment is just a research preview, so obviously there is nothing like that yet. As far as an appliance goes, this is mostly just an inference task. I think in the end, rather than a physical appliance, they will probably offer this as some type of what they already have existing, a software as a service model, where you give them your data sets, they retrain the model, and you end up being served some type of portal to a custom version that has been retrained. Whether or not that covers the on-prem clients is another matter, because obviously if we are trying to do this air-gapped, or, as you note, in a federal environment, a software as a service model might not work. I envision that if the demand becomes high enough, they will probably try to find a way to lock down something like that and provide that type of air-gapped access, but for the moment I anticipate they will be looking mostly at their existing, well-established software as a service model for that type of thing.
Just to note on that as well: there is value there, because it does not have a dynamic way to learn new information at the moment, so it cannot go out and learn about things it does not know about. It cannot tell you specific things about certain people, large figures in tech, celebrities, whatever. It has no dynamic ability to relearn from the internet, or even search the internet, at the moment. So continuous updates will have to be something they find a way to serve out. I envision that on their side they probably have the architecture to do that type of retraining effectively, to continually reintegrate the latest in current events and the latest in science. We just achieved nuclear fusion, for example; there are things like that which ChatGPT needs to know about, so continuous updates will be a part of it as well. I am not sure how they would do that in an on-prem environment; I imagine they will have solutions for people to work that out in those specific air-gapped cases. For the moment, I imagine the continuous updating, once this becomes a publicly available thing you are paying for, will be something they handle on their side.
Zane Hamilton:
Tronix has a question. What does generative pre-training and generative pre-trained transformer (GPT) mean? Go ahead, Dave.
What Does Generative Pre-trained Transformer Mean [42:04]
Dave Godlove:
I asked the chatbot that question because I did not know either. I will just read it to you. It says: the term generative pre-trained transformer, or GPT, refers to a type of artificial intelligence model that uses, blah, blah, blah, blah. The upshot is that the model has been trained on a large data set of text. The generative part of the term refers to the model's ability to generate new text; it is just generating stuff. Pre-trained obviously means that it has been pre-trained on a data set already.
What Are The Parameters Of The Model [42:42]
You had another question too, Tron W, that I thought maybe we could address a little bit. You were asking about the parameters of the model. I am actually unsure whether this is a deep learning, deep neural network model; I do not know. Forrest, you might know that already. I assume it is, because most of the cutting edge, really crazy intelligent AI these days is deep neural networks, or deep learning models. In the context of these models, the words themselves, the phrases, the training data, would not be the parameters. The parameters are things internal to the neural network models that a lot of folks have talked about. I am by no means an expert as far as the software goes, but basically there are these arrays, and each one of those arrays is one of the layers of the neural network.
Each one of these arrays is connected then by point-to-point connections, which go from one point in one array to all the different points of another array. The major parameters are things like the strength of those connections. What ends up happening is you give input to one point of the array. I think the easiest way to explain this is with the visual models; a lot of models of this type will recognize objects. You can think of a picture as a bunch of different points. They are pixels, and each one of those pixels has luminance values. Maybe they have individual black and white luminance values, or maybe they have three luminance values, one each for red, green, and blue.
Each one of those ends up being input to one point within the input layer of the model. That one point then projects to the points within the next layer of the model. The fact that you have multiples of these layers makes them deep. What you do during training is adjust the strengths: how much does one point drive another point deeper in the model, and how much does that point drive the activation of other points? That is what is actually being adjusted during training. What you end up with is this network of all these different layers, and sometimes they can be very complicated in how the layers interact with one another. At the end, you end up with all these weights, which are the parameters within the model, which have all been trained.
Then you can just put data in and get data out. That is another cool thing that goes back to your earlier question about pre-trained. All the really crazy computational work goes into training these models. Training is where you are updating weights and doing all this work: you are giving it tons and tons of data, getting tons of output, and then, based on the output, going back and retraining the model, and so on. When you are done, you take a snapshot of this thing and freeze it in place, and then you can deploy it out to hardware which is not really that crazy. All you do is feed it input; you have all the weights saved and you just get the output. That is one of the things that makes these pre-trained models so awesome: after they are trained, they are actually pretty lightweight and can be deployed out to hardware which is not that high powered and still work pretty well.
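Dave's description of layers, point-to-point connections, and trained weights can be sketched in a few lines of Python. This is only a minimal illustration of the idea, not the actual architecture of GPT; the layer sizes and weight values below are invented for the example.

```python
import math

def forward(x, layers):
    """Propagate an input vector through a stack of layers.

    Each layer is a list of neurons; each neuron is a list of
    connection strengths (weights), one per point in the layer below.
    Training adjusts these weights; inference, as described above,
    just multiplies through the frozen values.
    """
    for weights in layers:
        # Each neuron sums its weighted inputs, then squashes the
        # result into the range (0, 1) with a sigmoid activation.
        x = [
            1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(neuron, x))))
            for neuron in weights
        ]
    return x

# Toy network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],  # hidden layer weights
    [[1.0, -1.0]],                          # output layer weights
]
print(forward([1.0, 0.0, 1.0], layers))
```

Once the weights are frozen, running this "model" is just cheap arithmetic, which is why, as Dave notes, a trained snapshot can be deployed on modest hardware.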
Hardware That Has Been Used To Train AI [46:46]
Forrest Burt:
That is great. That is basically how it is, and I will touch on Sylvia's question as well in that context. That is exactly how it is, Dave. Computer vision is a great example there because there is a very easy one-to-one mapping between the layers and what you can see. Being a transformer-based model, GPT is definitely a deep neural network. I am not an AI engineer; I cannot speak to the differences between recurrent neural networks, transformers, LSTMs, all those different types of things. What I can say is, as Dave said, computer vision is a fantastic example, because you can model a computer vision model as what is called a multi-layer perceptron, which is essentially a very simple form of AI model where you have an input layer, some number of hidden layers, and an output layer. As Dave noted, the input layer might map to the initial RGB values of all of the pixels in an image.
Say you have a 500 by 500 size image which you are trying to do some computer vision on; all 500 by 500 of those pixels would, hypothetically, each be an input, a neuron, in your input layer. The computer is thus able to see the image by having all these initial neurons that tell it all of the initial RGB values. Say you are trying to build a model that detects human versus cat faces: as the data passes through the model, it might learn to detect the relevant features as you do backpropagation. As Dave said, you show it something and you adjust the weights of all the neurons in the model as it gets close. Through that process we end up training it.
It might learn, through the hidden layers, to recognize the distinct edges that are associated with a certain type of face, and then the distinct sub-features of a face that can tell it what type of face it is. The output layer is going to be just two neurons, cat or human, and that is going to be the entire output. So you have this input layer, whatever the math on 500 times 500 is, 250,000 I think, so 250,000 input neurons; you have all these hidden layer neurons; and then you just have two outputs potentially. What kind of hardware do we use to do this? Well, we normally use the GPU, as we have expanded on quite a bit. There are now other things coming out to hopefully replace the GPU.
We use the GPU for this because essentially all of the operations that go into applying these weights are incredibly simple mathematical operations. They are essentially long series of multiply-sum operations: you have some chain of numbers, they all get multiplied by something, and they all get summed back together. These are incredibly simple, very pipelineable mathematical operations, because with computer vision we are quite literally processing all the pixels on the screen in the same way. Obviously, in other cases we are doing the same kind of simple mathematical operations, but there is a very easy one-to-one mapping between processing pixels on screen with a GPU and seeing pixels in a computer vision model. A CPU, by contrast, probably only has 16 to 64 cores, and those cores are highly complex and highly pipelined; they do a lot of different things other than just math.
CPU cores are incredibly complex, but ultimately you are only going to be churning away on essentially as many neurons at a time as the number of cores you have. It is inefficient in the end to use CPUs for this type of training, because you are having these incredibly sophisticated cores do really, really basic multiply-sum operations on whatever 16 to 64 cores you have. If you go to a GPU, you have thousands of cores available, because it is meant to process thousands of little tiny pixel operations. You can land all those neurons, that backpropagation, that training, on these GPU cores, which are much, much less individually complex than a given CPU core but are still complex enough to do, really rapidly, these multiply-sum operations that are important for AI training.
We use GPUs ultimately because they have a really, really large number of really, really small cores available that do exactly the amount of math we need them to do. There are thousands of them on any given GPU versus, say, 32 cores on a CPU that are going to be inefficiently used, because they can do a lot more, a lot faster, than just multiply-sum operations, and that is all you are going to be using them for if you are trying to train on a CPU. This is also why we have all these new AI chips coming out: GPUs are great for this and were easily repurposed into AI training devices, but we are finding that the GPU is not the end-all-be-all. We can actually build silicon that has way, way more of these little simple cores on it, which can replace the GPU and be more optimal as a purpose-built solution.
Because all you need is a bunch of little tiny execution units churning away at those really basic operations really fast. That is ultimately why we do that. Stable Diffusion was trained on 32 8x A100 boxes, so 256 GPUs is what it took to train Stable Diffusion. I can only imagine what it took to train ChatGPT, because I think this is probably a more complex model. There is a lot more textual information, as we have noted; there is a lot more information you can encode as text that it has to know and has to be able to make connections between. Ultimately, I can only theorize about the actual hardware this runs on, but if Stable Diffusion took 32 8x A100 boxes, it is probably significantly more for ChatGPT.
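The "multiply-sum" operation Forrest keeps coming back to is just a dot product, and the 500 by 500 arithmetic above shows why there are so many of them. A minimal sketch (the numbers and the 1,000-neuron hidden layer are arbitrary, chosen only for illustration):

```python
def multiply_sum(inputs, weights):
    # The elementary neural-network operation: multiply each input by
    # its weight, then sum the products (a dot product). A GPU wins by
    # running thousands of these tiny, independent operations at once,
    # while a CPU has only a few dozen far more complex cores.
    return sum(i * w for i, w in zip(inputs, weights))

print(multiply_sum([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))  # → 3.0

# For the 500x500 image example: each hidden neuron does one multiply
# per input pixel, so a hypothetical 1,000-neuron hidden layer alone
# performs 250,000 * 1,000 = 250 million multiplies per forward pass.
inputs_per_image = 500 * 500
print(inputs_per_image)  # → 250000
print(inputs_per_image * 1000)  # → 250000000
```

Each of those multiplies is independent of the others within a layer, which is exactly the property that lets GPUs, and the newer purpose-built AI silicon, spread them across thousands of simple cores.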
How Accurate is ChatGPT When Writing Code Or Code Definition Files [52:31]
Zane Hamilton:
There is one more question I want to jump to real quick, and then, Forrest, I know you want to show us something. I know there is a lot of chat going on back and forth about who owns it, but there was one question about accuracy. James asked: how accurate is ChatGPT and GPT when it comes to writing code or code definition files? I think we have covered that a little bit, but I want to go back to it real quick. From what I have found, we are talking 85% to 90% there. This to me is where it becomes important to have someone who really knows what they are doing look at it to get it that last 10% to 15% of the way there. That is what I have found, and I assume it is probably similar for everybody else. That is also why it may be difficult to say it is going to replace somebody's job: somebody is still going to have to look at all of this and make sure that it is right.
Dave Godlove:
I would definitely not use it to do the final steps of an analysis for a paper and give me the p-value as to whether or not my hypothesis is correct. That would be a really bad use of this, unless you went back, actually did the math, looked at the statistics, and made sure it did them properly. That kind of thing would be a very, very bad use. To me, the best use right now is for the kind of task where I would otherwise spend half a day: say I need to ingest a bunch of data, like movies of something, pull them all in as frames, and then break them down into some color space. I could spend half a day looking up how to do all that and doing it, or I could just ask it to do it.
Can you just do it for me? Because if you give it something like that, it is very, very easy to test. You have a movie, you give it to the code, and you check whether it actually broke the movie down into individual frames and whether it actually put them in the color space you wanted. That is the kind of stuff you can easily check: bookkeeping, boring code to write yourself. You are not too worried about whether it is going to be right or wrong, because you can easily verify it. Maybe you are rusty in the language you are working in and it would just take you a little while to go through it and figure it out. That is the kind of stuff I would use it for right now. Not the stuff where you say, okay, great, I am not going to go back and check this, ChatGPT says it is correct, let's submit this paper to Nature. Probably a bad idea.
Forrest Burt:
I want to be a little leery about the verification thing. It is easy to have a whole team of people working on something, and then all of a sudden, when you have something that has all the skill of all these programmers built into it, to suddenly not need all those people, just a couple of them to verify things. I saw this metaphorically in a couple of places recently with advancing technology, where there were once eight people to verify something, but now there is just one person because of where the tech has come; I saw an example of this this week, very relevant to this. Like I said, and I will not expand upon it, I saw an example where there were once 48 people at a spot to verify certain credentials, but now there is just one person and an automatic electronic gate, basically.
The final thought that I would have is about the replacement of jobs. There is a lot of verification that has to be done with this; somebody still has to look at it. You can still end up doing a lot of harm in the end with automation, because the lift at that point is that you just need the best person, although, as somebody told me, it would probably end up being the cheapest person at most places. We do have to be careful there. This is incredibly powerful technology, and we do not really know, this is just a research preview, we do not really know what this is going to do once people can retrain it on their own data sets and that type of thing. We are already seeing what I view to be negative effects of the text-to-image stuff out there. It is just important that all of us remember that this is very new ground for all of us. This is really something that has only come out in the last couple of months. Watch out, look out for your fellow humans, and remember it is just a machine.
Zane Hamilton:
All right. Show us something.
ChatGPT Live Demo [56:54]
Forrest Burt:
Really quickly, if we can add that other board up here: we have ChatGPT joining us here today. We have a question that came in from the public chat about a certain sed command. I think this was in the context of looking at regular expressions and things like that. Live, we are just going to see what happens when we run this. It might completely blow up; it might work. We will see what it does. I have confidence we are not going to get anything that is going to be too terrible, so, on into the future.
Zane Hamilton:
It looks pretty benign.
Forrest Burt:
We will see if it is going to work at all or if it is severely under load at the moment. Oh, here we go. Okay, here we go. Hold on, let me zoom in a little bit.
Dave Godlove:
I always thought if I had a couple of dogs I would name one Sed and the other one Awk.
Forrest Burt:
We can see that we are getting quite a bit of output here. The -i flag tells sed to edit the file in place, the -E flag tells it to use extended regular expressions, and we can see the regular expression itself is quite complex, but it can be broken down into several parts. There we go. You can see, as Dave mentioned earlier with analyzing regular expressions, it just tells us straight up: this is a flag that tells sed to ignore case, which means that it will match Dave. It is looking ahead and figuring out what text the regular expression is actually going to be parsing. Then, here at the end, we get this: the sed command will search for strings that match the entire regular expression for replacement with a modified version of the string. In this case, da da da da da. It even gives us an example of what that regex is going to do. And then: if you are not familiar with regular expressions, you may want to consult a reference or tutorial to learn more about them.
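The exact sed command from the chat did not make it into the transcript, but the behavior being explained, a case-insensitive regex match and replace, can be sketched with Python's re module. The pattern, replacement, and text below are invented purely for illustration:

```python
import re

# sed's in-place, case-insensitive substitution (the -i flag plus the
# `I` modifier described above) corresponds to re.IGNORECASE here:
# every capitalization of the pattern is matched and replaced.
text = "dave, Dave, and DAVE all match."
result = re.sub(r"dave", "David", text, flags=re.IGNORECASE)
print(result)  # → David, David, and David all match.
```

As Dave suggests later in the discussion, the right way to trust a generated regex is exactly this: run it against as many corner cases as you can think of.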
Zane Hamilton:
Entire books written on it.
Dave Godlove:
Thanks. Thanks a lot.
Forrest Burt:
That is kind of funny, just a real-world example of what this looks like. And we have a comment from the chat saying it is not syntactically correct, so maybe there is something incorrect about that. This is perhaps an example of the Dunning-Kruger machine here, artificial intelligence being genuinely confident while wrong.
Justin Burdine:
Can you ask it who owns the copyright of the content that is generated? That was one of the questions we had come in, and I would be very curious to get its opinion.
Zane Hamilton:
So Dave did this. So I will be curious to see if we get the same thing twice.
Forrest Burt:
Does that capture the essence of your question, Justin? Okay: who owns the copyright of the content generated during this research preview of ChatGPT? It will either say it does not know what ChatGPT is, or it is going to give us an actual answer. It is interesting, because sometimes it knows about itself.
Justin Burdine:
When I asked it, I said, who owns the copyright of content generated by ChatGPT, and it said the content that is generated by a machine learning model, including ChatGPT, belongs to the creator of the content, and in most cases that is the person who trains the model. So it is very interesting to see the BS artist at work.
Forrest Burt:
It definitely has varying responses to things. As for the jobs this might create, prompt engineer is a perfectly legitimate way of looking at this, because it takes an expert to analyze these results for accuracy, and it sometimes takes an expert to query this. I was trying to get it to build some bioinformatics code, but I found that my prompt was a little bit incorrect from the start, in its assumption of what one piece of software was going to do. When I took the code to a couple of the bioscience people, like Dave and another person at CIQ, Glen, they said, well, you do not quite use this tool for that type of thing. It is odd that even though this was an incorrect thing to ask it for, it just gave you: oh, here is how you do it with that. So there is definitely a lot of variability.
Dave Godlove:
I would like to point out too that this is an excellent example of what I was talking about earlier, because we have two different questions here. The first: if you needed that regular expression to work, then after you asked it everything you needed to know about the regular expression, the next thing to do would be to go to your terminal and test that regular expression against as many different corner cases as you can come up with. If it checks out, you should be happy with the regular expression that was generated. The second question is obviously one you cannot test that way; you cannot just stake your claim to some copyright and go to court over it, which would be the real way to test that, and you probably do not want to do that. So there is a good example of good questions and bad questions for the chatbot at this point.
Zane Hamilton:
Ask it real quick to create a container definition for something simple.
Forrest Burt:
Okay. I just want to point out really quickly that there was a little bit more that we missed there. It does say: if you are the human author of the content generated by the ChatGPT research preview, you are the owner of the copyright, you have exclusive, you know, blah blah blah. So, kind of interesting. I just wanted to show that it did give us a full response there, one that was a little bit more complex and touches on a couple of things. Now, really quickly, Zane, what did you say? Build a container definition file?
Justin Burdine:
It is interesting, mine ended, mine was a much shorter answer about copyright law. It ended by saying, you should probably consult a lawyer, which is probably a great way to caveat its response.
Forrest Burt:
A GPU-enabled one? Okay, a definition file for an NVIDIA GPU-enabled container with PyTorch inside of it.
Dave Godlove:
Glad you chose a simple one.
Forrest Burt:
Well I can do it.
Forrest Burt:
Oh, I think we might have nuked it.
Justin Burdine:
Hit the end.
Zane Hamilton:
That kept happening to me. I would get halfway through writing something and it would network error out, and I would have to go back and do it again. It happened several times.
Jonathan Anderson:
In the meantime, while we are waiting for that: my experience so far has been that it answers very confidently and very positively, much like it generating that Golang code I tried earlier, which it produced assertively but which was wrong. I just asked it, what does this bit of Python do, with a Python 2 style Hello world! but run with the Python 3 interpreter, and it said, oh, it prints Hello world! But of course that is incorrect, because Python 3 changed the syntax for that. It does not know when things are going to go wrong. It cannot evaluate the correctness of it. It just puts something up there that looks right.
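Jonathan's example is easy to reproduce: Python 2's print statement is not just different output under Python 3, it is a syntax error, which is exactly the detail the model glossed over. A quick check:

```python
# Python 2 allowed `print "Hello world!"` as a statement; Python 3
# made print a function, so the old form no longer even parses.
try:
    compile('print "Hello world!"', "<python2-style>", "exec")
    print("parsed fine")
except SyntaxError:
    print("SyntaxError: Python 3 rejects the Python 2 print statement")

# The Python 3 form is a function call:
print("Hello world!")
```

So rather than printing Hello world!, the snippet ChatGPT was shown would never run at all under the Python 3 interpreter.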
Forrest Burt:
I think we can just take it down one more time really quickly for that reason; logging out and logging back in might then work. So give me just one second, everyone.
Dave Godlove:
I am trying the same thing on my end and I am getting the same thing.
Justin Burdine:
Did you guys see how fast this thing got to a million users? DALL-E took two months to get to a million users; ChatGPT took five days. That is just pretty astounding, and every time I am on it trying something out, inevitably at some point in the session it will say it is overloaded. A lot of people are quite interested in this.
Forrest Burt:
It looks like even on a fresh, cleared history and re-login it is telling me that their servers are overloaded. We almost had the entire definition file; it would have been very impressive to see. We have another regex here in the output.
Dave Godlove:
They probably have us to thank for that, right? In this webinar we increased their popularity.
Zane Hamilton:
Probably we did.
Zane Hamilton:
We are up against the end of our time anyway, so I appreciate it. I am not going to promise that there will be a blog about this, but it might be interesting if Forrest would write something up real quick, do the container definition, and throw out there what it generates, just so people can see and we can follow up. I really appreciate it. Thank you guys for joining us this week. It is an exciting topic; I know we could probably keep talking about this the rest of the day because it is super interesting. We appreciate you joining, and thank you for all the interaction. Please go like and subscribe if you enjoyed this. We will see you, probably after the holidays; I do not think we are doing one the next two weeks. We appreciate it, and thank you for joining us.