Research Computing Roundtable: Programming Trends in Computing
Next, in our Research Computing Roundtable series, we will discuss programming trends in computing. Our experts bring a wealth of knowledge and are happy to answer your questions during the live stream.
Webinar Synopsis:
Speakers:
- Gregory Kurtzer, Founder of Rocky Linux, Singularity/Apptainer, Warewulf, CentOS, and CEO of CIQ
- Zane Hamilton, Vice President of Sales Engineering, CIQ
- Glen Otero, VP of Scientific Computing, CIQ
- Jonathon Anderson, HPC System Engineer Sr., CIQ
- David Godlove, Solutions Architect, CIQ
- John Hanks, HPC Principal Engineer, Chan Zuckerberg Biohub
- Gary Jung, HPC General Manager, LBNL and UC Berkeley
- Alan Sill, Managing Director, High Performance Computing Center at TTU
Note: This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors.
Full Webinar Transcript:
Zane Hamilton:
Good morning, good afternoon, good evening, wherever you are. Welcome to another CIQ Webinar. Another Research Roundtable Webinar. This week we are going to be talking about programming trends in computing and software development. I would like to bring in our panel. There we go. Welcome back, guys. I think everybody has met you guys, but I'm going to go ahead and let everybody introduce themselves again. I'm going to start with you, Gary.
Gary Jung:
My name is Gary Jung. I work at Lawrence Berkeley National Laboratory. I manage the scientific computing group, where we provide the institutional HPC for researchers. We also manage the HPC program for UC Berkeley.
Zane Hamilton:
That is great. Thank you. Dave?
Dave Godlove:
Hey, everybody. I'm Dave Godlove. I am a solutions architect at CIQ. I used to be a neuroscientist at the NIH and I have been in the HPC community and also in the Apptainer community for the past several years.
Zane Hamilton:
Thank you. Glen?
Glen Otero:
Glen Otero. I'm the Director of Scientific Computing at CIQ. I have been a scientist and a cluster builder for researchers for the past 20 years. And now I'm at CIQ.
Zane Hamilton:
Thank you. John?
John Hanks:
John Hanks. HPC sysadmin at CZ Biohub. I have been a sysadmin for a very long time.
Zane Hamilton:
Thank you, John. Jonathan?
Jonathon Anderson:
Hey. I'm Jonathan Anderson. I'm an HPC engineer and Solutions Architect with CIQ. I have also been an HPC systems administrator in many past lives, and I have never been a research scientist.
Zane Hamilton:
Thank you, Jonathan. Greg, thanks for joining.
Gregory Kurtzer:
Hi, thank you. Hi, everybody. I'm Greg. I am the founder and an open source guy. I have spent most of my career doing science and scientific computing.
How Has the Research Development Community Changed? [2:33]
Zane Hamilton:
Thank you. Let's jump into this. I know we were talking about this yesterday. I wish we could have recorded some of it because the conversation was really good and went in some ways that I didn't expect it to. I want to start off and just ask you guys, and I will start with you, Gary. How have you seen development in the research community change over the last 5, 10, 15 years?
Gary Jung:
Oh boy. Things have changed dramatically. We started the program here over 20 years ago at Berkeley Lab. If we go back 15-20 years, we were still seeing Fortran code that would run on large supercomputers like the IBM SPs, for example. Then we started to see a transition to more standardized MPI codes, a lot of closely coupled MPI codes written in C. In the last four years there has been a dramatic shift over to Python. That really shows up in the questions that we get and also in how it hits the system.
Zane Hamilton:
Thank you. John, I'm going to pick on you now. Have you seen things change in the last 5, 10, 15 years?
John Hanks:
Sorry, I had a hard time getting off mute. Life sciences software over the last 20 years moved from Perl to Python, maybe ahead of other things that moved to Python. There hasn't been a whole lot of shift recently as far as the languages that people are using. It's mostly R and Python wrapping a bunch of binary applications that seem to not ever change over very long periods of time. The thing that jumps out at me on a day-to-day basis is that we are approaching a point where the number of developers is larger than the number of researchers. It's becoming less about research and writing software for research and more about developers bringing enterprise-type workload tools and stuff into the research space.
Zane Hamilton:
Interesting. That is something we will get into a little bit later, John. I would love your take on that, but I'm going to keep going around. Glen, you have been a researcher for a while. What have you seen change over the last 15 years?
Glen Otero:
The switch from Perl to Python has definitely been a big obvious trend. I remember being in graduate school and Python didn't exist. A lot of things have changed for me since I was a scientist. I first worked on a VAX system, so all this is pretty new. Trend wise though, it's been more Python and R. Really the higher level languages making it easier to get work done, but not necessarily better software practices. Right. I think Git has become a bandaid for some, a crutch for others, and a way to think that their work is reproducible and transparent and whatnot. I think that it is still not taken seriously enough in the research space. Again, just speaking for bio researchers.
Zane Hamilton:
Got it. Before I change topics or keep diving down this hole, does anybody have anything they want to add to that? Anything they have seen change? Greg, absolutely, go ahead.
Gregory Kurtzer:
I talk a lot, I know. When I first started getting into high performance computing, and I have talked about this before in our roundtables, high performance computing meant something very specific. It was very tightly coupled, highly parallel applications that were highly specialized for running on, I almost want to say non-commodity type systems, but they were built out of commodity systems. There were a lot of really tightly coupled, very scalable, very large, capable applications. We used to call this the long tail of science. The long tail of science is starting to grow and starting to increase. We are starting to see more and more different types of use cases running on these HPC systems. I have had to redefine HPC in my own head as to what HPC actually means.
I went from being so specific that HPC was this idea of very tightly coupled applications that were very scalable, to it being anything that is going to spin your CPU and generate heat. Generally, I think of it as jobs as opposed to services. What have we seen change? We have seen a transformation coming from very specific applications developed for tightly coupled and parallelized runs, and going and spanning a much wider discipline of not only science and types of jobs, but becoming much more prevalent. I never thought I would see the day in which enterprise organizations would start asking me, "how do we go build an HPC cluster?" Right? Things have changed a lot over the past few years. I definitely agree with the points about some of the programming languages and whatnot changing.
I would also say containerization has changed as well, as has the interest in orchestration. We are starting to see big centers that are all of a sudden starting to talk about: how do we integrate services with our compute resources? How do we run Kubernetes? How do we integrate? I talk to people that are like, "how do we integrate VMware into our HPC stacks?" We are starting to see a transformation. Al Gara, who used to be an Intel fellow, I think said this really well: "due to machine learning (AI/ML), compute- and data-driven analytics, coupled with simulation, we are starting to see some cross pollination, which will end up with systems of much greater capability and class." I think that is definitely happening. As a result, I think the systems and the infrastructure that we are talking about in terms of building, managing, and supporting these application sets are also transforming in order to better support those types of workloads.
Gary Jung:
I had something.
Zane Hamilton:
Absolutely. Gary.
Gary Jung:
Yeah. Just to add onto that. As we have seen the shift to more Python, a lot of it is this long tail of science. Most people think of life science codes, but it's more than that. We will see people in, say, archeology, who are using the HPC systems to do photogrammetry to convert 2D images into 3D, inside of, say, a tomb, for example. Or maybe with the patent office: they want to do OCR scanning of the patents and digitize them, but just the sheer number of patents that they want to scan makes it an HPC task. We are seeing it starting to reach a lot further than I would have initially thought at first take.
Zane Hamilton:
It's very interesting.
Gregory Kurtzer:
Even archeology, huh? Gary and I started seeing a whole bunch of things like political science, wanting resources on the computer systems, library science, wanting resources on the computers, and now archeology. That's the first time I have heard of archeology on an HPC system. That is tremendously awesome.
Zane Hamilton:
That's very cool. Jonathan, I know you wanted to jump in here.
Jonathon Anderson:
Yes. We have talked here and in other meetings about this idea of the broadening of the long tail of science and how that has affected HPC. Something I was thinking about just as, I think, Gary, you were talking, was that it's also been broadening on the on-ramp too. That really leans into the Python ecosystem, where we have seen things like scientific Python, SciPy, MPI, Pandas, and all of that kind of, no, I won't say supplant, but at least take over some of the use cases that people might have had to use MATLAB for in the past. The availability of that ecosystem makes it easier for people to get in. The applicability of that to a cloud environment also reduces some of those barriers to entry. People can come in with that same skill set, whether they were doing something small on a cloud system or on a university HPC system. Also, once that expectation is set, there is this comfort with web services and the idea that you have an interface that is backed with something more capable. Where I saw that at my last position was just how much work people were getting done with Jupyter Notebooks.
I presume that is inspired a lot by experiences with RStudio. I'm really encouraged to see that on-ramp open up with transitions to more open code, open technologies like open source software and freely available software, and how the incentive structure around cloud computing has incentivized getting more people on-ramped more quickly. As much as we can take that and apply it to HPC, the better we are in that space too.
CI/CD in the HPC World [12:41]
Zane Hamilton:
Thank you, Jonathan. Jonathan, I do have a question for you. I know we have talked about, and Greg brought up earlier, containerization and more modern design patterns, that type of thing. How much are you seeing CI/CD, and if you want, define CI/CD as well. How much are you seeing that in the HPC world?
Jonathon Anderson:
Yeah, and as we discussed, we will broadly define CI/CD: continuous integration, continuous deployment. Basically I would generalize that as any kind of structure around building, testing, and making your software available, rather than "I built and tested it on my local workstation." I will say I saw that on the systems side. We were using that to build node images, and we talk about that a fair bit as something that we would like people to be doing with Warewulf. I don't really have a good sense of how much that is happening in that imagined graduate-student research code development space. I would look to, frankly, probably anyone else on this panel to see if they have direct experience with that. I know that is something there has been a lot of value in on the enterprise side, on the web and hyperscaler side. I'm curious if the current batch of students are coming in with that expectation built in and with that framework in mind already, and as a result are building codes that way, or not?
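As a rough sketch of the structure Jonathon describes, a CI/CD pipeline is just an ordered set of gates where each stage must pass before the next runs. The stage names and steps below are hypothetical placeholders, not from any project discussed here:

```python
# Toy sketch of a CI/CD pipeline driver: ordered gates (build, test,
# deploy) where each stage must succeed before the next runs, and a
# failure stops the pipeline. Stage names and steps are illustrative.

def run_pipeline(stages):
    """Run (name, step) pairs in order; stop at the first failure."""
    completed = []
    for name, step in stages:
        if not step():
            return completed, name  # name of the failing stage
        completed.append(name)
    return completed, None

stages = [
    ("build", lambda: True),   # e.g. compile the code
    ("test", lambda: True),    # e.g. run the unit tests
    ("deploy", lambda: True),  # e.g. publish an image or module
]

done, failed = run_pipeline(stages)
```

The value over "it worked on my workstation" is that the same gates run automatically on every change, so a broken build or failing test is caught before anything is deployed.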
Zane Hamilton:
Yeah, a little bit. Anybody that wants to answer that, but I will pick on Dave first just because Dave has been in that world.
Dave Godlove:
Sure. I mean, partially to answer that question and partially to go back to your original question: how are things changing, or how have they changed in the last 10 to 15 years? I can give you my view on that by going through the journey that I have taken from being an undergrad student all the way up to where I'm at now, and use that to illustrate how things are changing. When I first started as an undergrad, I did some research. As an undergrad we would gather data and we would enter the data and we would bring it to the PI, and the PI had SAS. He was the magic person who had SAS and knew how to analyze the data and get some statistics out of it.
We would watch him do that. Then when I got into grad school, everybody was like, "well, do you have any programming experience? Specifically, do you have any experience with MATLAB?" In undergrad, nobody had ever told me that I would need to have some programming experience to be in neuroscience, right? That was the advice that I used to give undergrads as a graduate student: "take some classes in programming, learn some basic programming." I think that is ridiculous advice now. I don't think you would have to go tell an undergrad, "hey, you are going to have to learn how to program," because I think most undergrads who are interested in pursuing some sort of research career are going to be aware of that.
I think they are going to know that. All through grad school, I didn't really have any; it was all just MATLAB pretty much. It wasn't until I got out into my postdoc that it was like, "oh, here, do some stuff at the shell; here, use Bash; here's Python; here are some other programming languages that you need to be able to read at least a little bit of, to look at one of your colleagues' code and understand what's going on," and so on. Then I went from there into the HPC environment, where I was exposed to lots and lots of different languages, technologies, and so on. This was at the NIH, and we served the intramural NIH community. When I started there, I think we had about one quarter of the scientists who were intramural on the campus registered to use the high performance computing cluster (the HPC systems). By the time I ended there, somewhere in the neighborhood of six years from start to finish, we had three quarters of the scientists on campus registered to use the HPC system. When I started there, we had some staff members who would teach Bash basics. They had a class that was basically, "let's come in and figure out how to manipulate files, write to files, and do stuff like that at the command line so that you can use the HPC systems a little bit."
They ended up pre-recording that class and putting it up on the website, because it got to the point where almost everybody who came in to train already had some fundamental experience. That was a shift just within the six years that I was there. I think you can see that trajectory from "what? I have to actually use a computer?" all the way up to "oh yeah, everybody who comes in here already knows basic commands, file management, and stuff like that at the shell, and has some programming experience." I think that is probably the direction in which the community is heading. You know?
Zane Hamilton:
So on that, David, this brings up a question for me. And thank you, Scott; glad you enjoyed the meeting/webinar, call it whatever you want. We appreciate it.
Has the Availability of Linux Contributed to People's Prior Knowledge? [18:20]
Dave, the point you made: people come in and they already know things. I wonder if you can attribute some of that to the readily available versions of Linux, being able to go install Linux on a very inexpensive piece of hardware or on an old laptop. That has become easier and easier to do as we get further away from the beginning of Linux. Do you think that contributed to it? People just have the ability because it's cheap and it's free; you can go do it on any old hardware. So maybe they show up with a different level?
Dave Godlove:
Once again, from my own personal standpoint, that definitely helped me along, yeah. I think Linux is becoming more and more mainstream. People are using it more and more at home. They have got some old hardware and they want to be able to use that hardware for something, so they install Linux on it. Definitely, I think that is helping. I also think a lot of people probably get their first exposure to the command line and the shell just from using a Mac. The Windows Subsystem for Linux is probably contributing to that as well, giving people who don't even have their own Linux system some ability to do basic file manipulation or some basics, like SSH, at the command line.
That is probably giving exposure to people. I think that the bottom line is that users are starting off being more educated, knowing more about, not just about how to use their computer and how to program, but like the ecosystem, like GitHub, GitLab, and CI/CD pipelines. What those are, what to use them for, maybe some experience with them. That kind of stuff. You're seeing people coming in with a higher level of technical familiarity, I believe, than what we used to have just even five or six years ago.
Zane Hamilton:
That's great. Thank you. That's Alan. Thank you for joining.
Alan Sill:
Yeah, sorry I was a little bit late. I had Slack issues on a new computer.
Software Perspective Changes in the Last 15 Years [20:42]
Zane Hamilton:
No worries. I will ask you what we started off with. What have you seen from a software perspective has changed over the last 15 years?
Alan Sill:
Well, certainly the big change is the emergence of cloud computing as not only mainstream, but even as the beginner's getting-started paradigm. Part of my job at the university is to make sure that administrators understand the financial trade-offs of different choices. Mercifully this period has passed, but two or three years ago, it was very common for a new administrator to come along and ask, regarding our high performance computing center, "Why don't we just move it all to the cloud?" And you know me, who started a cloud research center over a decade ago, saying "yeah, what an idea," and having to explain all the reasons. For people getting started, there is no better environment. You can do anything, including bringing up clusters and very sophisticated frameworks, Kubernetes frameworks, as a newbie. Now, a different question is whether people want to do that. Here's my short list of ways to get thrown out of a chemistry professor's office: start talking to them about your new cloud computing framework, your new compiler options, or stuff like that. They are focused on their work. So the presence of a multitude of options, I think, is a plus and is new over the last 15 years, but it's not all of the answer.
How Has Modularity Changed Things? [22:40]
Zane Hamilton:
That's great. Thank you, Alan. Let's keep going down this rabbit hole. These next two questions are related, or one and the same. We talked about it to some degree, but look at newer languages like Python being able to bring in a much easier version of modularity. Is that playing a role? Does it make it easier for people to get involved? Because I know Greg is a big Perl fan. In the past, writing Perl, and I'm guilty of this as well, you didn't use as many of the other libraries or other people's code; you just went and reinvented the wheel every time. You wrote your own from scratch. Now with things like Python and some of the other newer languages, you don't have to do that, and it's very easy to go get other code that just works. Is that something that has helped drive people? I guess they can go a lot of different directions with that. I will stop there. Has modularity changed things?
John Hanks:
I have an opinion about that, but I'm very cynical about it.
Zane Hamilton:
No, absolutely. Share.
John Hanks:
Modularity has made it really easy to write something that does a very complicated task really quickly. It's also made it really easy to put something that is critical to your day-to-day existence in the hands of someone who's going to walk away from supporting it. You have no way to fix it and keep it going. I really think the modularity that Python introduces is a ticking time bomb that is going to come back to bite us at some point in the future. Anybody who has to use and manage Conda probably will not have a good argument against that.
Zane Hamilton:
That is what I wanted to get to. I want to understand. I mean, I know Alan's got a take again, but I would like everybody's perspective on this because I think it is interesting. As we go, I have other questions that are related to this. So I will go to Glen next.
Glen Otero:
I hadn't thought about modularity being a double-edged sword, so that is very interesting. But I do violently agree about the Conda sprawl and how I thought it was the silver bullet. Now it seems to be just like an anchor around my neck. I have deleted environments when they failed to rebuild and update overnight. It's become a real double-edged sword. As you were mentioning that, I thought, "yeah, it's all about the driver behind the wheel." We still need better programmers to realize that and know the difference.
John Hanks:
You were right. Conda is a silver bullet. The problem is the gun was pointed at our head.
Glen Otero:
That's right. Yeah. I mean, when it started, I was just starting to be able to bring R into my environments and things like that. I just said, "this is going horribly wrong." Then I started having trouble finding anything in Python on my laptop, based on just that alone. We are going to need an HPC Carpentry class just on Conda.
Alan Sill:
Though I have to say that the Mamba team has been doing good work and speeding it up. I heard through a Twitter friend of mine about Picomamba, which runs, get this, in WebAssembly in your browser.
Zane Hamilton:
Interesting.
Is Mamba Problem Solving Sustainable? [26:25]
Dave Godlove:
Is that ultimately sustainable though? Because as I understand it, the way in which the dependencies are solved is just like a factorial problem, which continues to grow and grow. Mamba, really all that is doing is taking this problem, which is going to grow exponentially, and making it run faster because it's written in C++ instead of Python. That is a bandaid, but ultimately, it seems like the entire system needs to be reworked. Did I misunderstand that?
Alan Sill:
No, I think you understand it perfectly, but faster is better than slower. Redesign? Well, there's a related issue. We have been probing dependencies and all the package management that goes around managing them. I think in previous shows we have talked about things like Spack and EasyBuild, which try to manage dependencies. That's been a big contributor to the success of many projects here in the US; the Exascale Computing Project essentially couldn't work without it. European projects, I think, have been helped a lot by EasyBuild. Along with that has come the GitHub generation, right? Whether or not you're using GitHub, you're using some repository, now with the ability to track security dependencies within your code and to scan things. We are in a world now where you can run content security scanning on released containers, but what it will find is essentially that any container that is more than a microsecond old has something you ought to go fix. Are you going to do it? To some degree, we are just moving the sand around on the beach here, hopefully building some sandcastles along the way.
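Dave's point about the growth of the dependency solve can be illustrated with a toy count. Real solvers (conda's SAT-based solver, libmamba) prune the space aggressively rather than enumerating it, so this is only a worst-case intuition, and the numbers are invented:

```python
# Toy illustration of why environment solving blows up: with n packages,
# each having k candidate versions, a brute-force solver faces k**n
# possible combinations to check for mutual compatibility. Real solvers
# use SAT techniques to prune this, but the worst case stays exponential.

def naive_search_space(n_packages, versions_per_package):
    """Number of version combinations a brute-force solver would face."""
    return versions_per_package ** n_packages

small_env = naive_search_space(5, 10)    # a handful of packages
large_env = naive_search_space(50, 10)   # a sprawling environment
```

Going from 5 packages to 50 takes the naive search space from 10^5 to 10^50 combinations, which is why rewriting the solver in C++ speeds up the common case but cannot change the shape of the underlying problem.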
Reinventing the Wheel [28:30]
Zane Hamilton:
Thank you, Alan. Dave, I know yesterday we were talking and you were telling us a story about reinventing the wheel, why not to, or why you should. It was a really interesting story and I would love for you to share it, if you don't mind.
Dave Godlove:
Sure. I mean, I have got some strong opinions about reinventing the wheel. I think I should probably write a piece at some point called "In Defense of Reinventing the Wheel" or something like that. For well established things, I understand the basic concept is easy to grasp, right? You don't want to duplicate effort. If somebody else has already done that, then you can get a lot further by leveraging the work that they have already put into it. But I have seen in science a lot of times where people have said, as justification for what they are doing, "I don't want to reinvent the wheel," and then taken somebody else's code and applied it to something for which it was never meant. I have actually seen people take my code, when I was in that world and writing code. I have been in presentations and stuff where somebody took my software and applied it to a problem that it was not meant to be applied to.
I have been like, "oh, wait a minute, that is not a wheel, that is a blender. You just strapped it to your car and you're running on it. It's kind of going, but it's not doing what it's supposed to, and your answer is wrong." You know? I think you want to be really careful about that. The bottom line is you can't just turn off understanding and say, "this black box is going to do what I expect it to do." You still have to have some level of knowledge about what those wheels are, the ones you're using rather than reinventing, before you try to apply them to your work.
I used to say that it also irritates me when people say, "I don't want to reinvent the wheel," because it trivializes what people are working on. It's like, we don't program wheels. We program rocket ships that are really complicated, and these things need to be re-engineered, refactored, and reworked all the time to make them better. They are not just silly little wheels that you could program up in an afternoon but choose not to because you don't want to waste the effort. These are complicated things that people have thought about for a long time, so don't trivialize it. Learn about it and work on it. Now, the story that Zane was talking about: I used to work in a lab where I studied the oculomotor system. This is the system that moves the eyes, the neural control of that, and the muscle control of that.
We tracked eye movements, and we had software to detect eye movements called saccadic eye movements, or saccades. These are called saccade detection algorithms. Everybody was writing their own saccade detection algorithm, which were all loosely based on pretty much the same thing, but there were some differences in them. My boss at the time got upset about this situation and said, "we have got to stop. Everybody needs to stop writing their own saccade detection algorithms. We should just write the thing, see, compile it, throw away the source code, and be done with it. Just use that from now on." Right? Then I ended up reading a paper about something called microsaccades, which are small, tiny eye movements that happen all the time, even when you're fixated on an object.
It had within it some new saccade detection algorithms to more accurately detect microsaccades. I ended up implementing that and ultimately writing a paper about it. If we had just set in stone "this is the saccade detection algorithm we are using" and never decided to innovate on it, then I wouldn't have been able to write that paper. We wouldn't have gotten that knowledge into the scientific community from the research that I was able to do there. That is my argument against "never reinvent the wheel." Scientific software is a little bit different. It's always on the cutting edge. It's always doing new and sometimes messy things, right? Because you're in uncharted territory, investigating things that nobody's thought about before. That is just the way it is.
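For readers unfamiliar with the technique Dave describes, most detectors in this family flag samples where eye velocity crosses a threshold. The sketch below is a generic velocity-threshold detector, not the specific algorithm from the microsaccade paper he mentions; the trace, sample rate, and threshold are invented for illustration:

```python
# Simplified velocity-threshold saccade detector: differentiate the eye
# position trace and flag runs of samples whose speed exceeds a
# threshold. A generic sketch of the family of algorithms discussed,
# not the method from any particular paper.

def detect_saccades(positions, sample_rate_hz, threshold_deg_per_s):
    """Return (start, end) index pairs of above-threshold velocity runs."""
    dt = 1.0 / sample_rate_hz
    velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]
    events, start = [], None
    for i, v in enumerate(velocities):
        if abs(v) > threshold_deg_per_s:
            if start is None:
                start = i
        elif start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(velocities)))
    return events

# A fixation, a fast 2-degree jump, then another fixation, at 1000 Hz.
trace = [0.0] * 10 + [0.5, 1.0, 1.5, 2.0] + [2.0] * 10
events = detect_saccades(trace, 1000, 100.0)
```

Everybody's version of this differs in how velocity is smoothed and how the threshold is set (fixed vs. adaptive), which is exactly the kind of variation Dave's lab was arguing about.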
Zane Hamilton:
That's great. I love that story. Dave, thank you very much for sharing.
Instead of Fixing Code, Can I Throw More Resource At It? [32:57]
I know we have talked about cloud. Alan, you brought up cloud earlier, and I think about what I have seen, especially in the enterprise with some very large applications that have a lot of developers. As part of going through a CI/CD pipeline and trying to do things very quickly, things have grown and sprawled to a place where we are now with infrastructure. My question is, do you see it as easier to just add infrastructure, be it cloud or any other type, because it's almost commoditized and it's cheaper? Instead of fixing the code, I can just throw more resource at it. Is that a problem that we are seeing? Is that something we have enabled? Is that part of what modularity and everything else is enabling?
Alan Sill:
Ask your CFO. I think some people are learning experimentally the consequences of some of that thinking, or quickly running through their budgets. There was an interesting Twitter discussion, which was recently turned into a blog post by one of the people who started the organization 2i2c, the International Interactive Computing Collaboration, focused on things like Jupyter, Binder, and Executable Books. The guy's name is Chris Holdgraf, and he just posted a blog summarizing why academic researchers don't use cloud services. He asked Twitter and then turned it into a blog post; I could even be a writer with this mechanism. It is actually a very nice, thoughtful take on it. I encourage people to look for that.
His blog is at predictablynoisy.com and it's worth diving into; it focuses exactly on this topic. What are the consequences, the financial aspects, confusion, cost, and pain involved in using cloud services? My main reason for not spending a lot of time on it in my day job is cost. Bulk computing is still far cheaper on premises, but I find it frustrating because I have this whole research center on the use of cloud. We are trying to solve the problems that Chris discusses in this blog post.
Cloud Enabling Bad Code to Exist [35:40]
Zane Hamilton:
It's interesting. Thank you, Alan. Greg, what do you think about cloud enabling bad code to just exist?
Gregory Kurtzer:
Well, first I wanted to go back to that Perl comment you made about me a while ago.
Zane Hamilton:
Hey I'm guilty too.
Jonathon Anderson:
Speaking of bad code.
Zane Hamilton:
I'm guilty.
Alan Sill:
The duct tape of the internet.
Zane Hamilton:
Yeah, it was awful. Hey, Jonathan gets the award for taking the biggest punch.
Gregory Kurtzer:
There's another point of the double-edged sword, and sorry to go back a little bit, but it's anyone who's ever had to battle with CPAN modules to get different bits of Perl code to work. When we talk about reproducibility in science and then you start talking about the dependency hell of CPAN modules, that is a very difficult problem to overcome. Dave and I have always gone back and forth on this, and it just came up again very recently. Dave had another tremendous quote on this, a good mic drop moment, where we were trying to rebuild a container or have a container rebuilt. Even though the recipe file didn't change at all, last week it worked; this week, it doesn't. That is because the internet is not reproducible. That is Dave's quote, and I liked it.
I wanted to throw that in there. When we start talking about all these modularity features in these software applications, it is definitely a double-edged sword, especially as we are talking about science and reproducible science. It is a reason to containerize, in my mind. But you just brought up a really interesting question, and I'm blanking on your exact wording. What I got out of it was: does cloud enable bad programming and/or bit rot, in a manner of speaking? Containers absolutely do enable bit rot. That is not necessarily a bad thing, especially, again, as we are talking about reproducibility in science. It is, like everything, a double-edged sword. We have to balance that out, right? Sometimes you want to have something that is totally legacy, but 100% guaranteed reproducible from a software stack perspective. If it gave you the wrong answer, you want to repeat that wrong answer. Anyway, that was my take on that. It's not really specific to cloud, but containers are definitely an enabler of cloud. That is how I will tie that together.
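To make the "internet is not reproducible" point concrete, here is a minimal, hypothetical Apptainer definition file sketch. The base image, package names, and versions are illustrative, not from the discussion; the point is that an unpinned recipe can build differently from one week to the next even when the recipe file itself never changes:

```
Bootstrap: docker
From: rockylinux:9          # unpinned tag: can move to a newer point release

%post
    # Unpinned installs resolve to whatever is latest at build time,
    # so the same recipe can produce a different container next week.
    dnf -y install python3 python3-pip
    pip3 install numpy scipy

# A more reproducible variant pins what it can, for example:
#   From: rockylinux@sha256:<digest>            # exact base image by digest
#   pip3 install numpy==1.26.4 scipy==1.11.4   # exact package versions
# Even a fully pinned recipe still depends on upstream servers
# continuing to serve the same bits.
```

Preserving the built container image itself, rather than only the recipe, is what gives the "100% guaranteed reproducible" software stack Greg describes.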
Zane Hamilton:
Absolutely, and the example that I was drawing from, Greg, is that I worked with a group. They had 400 developers developing one large thing, and they were to the point where they were throwing about 3 million errors an hour. Instead of fixing the errors, it was easier to add VMs to the environment. The attitude was: we are going to continue to throw those massive numbers of errors, because it's just easier. It got to the point where the cost was cheaper, too. I can just keep giving it VMs and adding infrastructure rather than going back and fixing code. I feel like at some point there is that scale where I'm not going to go back and fix it; I'm just going to keep throwing hardware at it. It's just easier.
Gregory Kurtzer:
That is not specific to cloud. There are lots of situations in high performance computing where, instead of optimizing and profiling your software, it's easier just to buy more nodes and run it bigger, or run more instances of it and break it up smaller. There have been a number of situations I can think of where optimizing the code was not even one of the top five priorities in terms of how they were going to make their jobs and workloads run faster.
Alan Sill:
Right? I think priorities is the right word, and here I think we have a chance to illuminate the different stakeholder communities in software performance. If the topic is writing code and tools for writing code, we have to recognize that many HPC users, many cluster users, and cloud systems users are domain scientists or business users who essentially aren't programmers. That would typically be their answer. The difference being that with an on-prem system, you're much more likely to have a sysadmin contact you and say, "what the hell are you doing to my file system?" Whereas, as I mentioned earlier, it'll be your CFO contacting you if it's in the cloud. The question then, I think, is what and who the communities interested in software development and performance characteristics are. We have the various C++ committees developing ever more advanced versions of that language, and even GCC advancing regularly through versions. How do you connect to that community while making these things of use to users?
Generalist Developers vs Specialized Developers [41:11]
Zane Hamilton:
Absolutely. And Glen, I know we talked a little bit, yesterday. You brought up the topic of like generalist developers versus a very specialized developer and the differences thereof. Can you kind of dive into that a little bit?
Glen Otero:
Yeah. So one of the trends I have noticed over the last 15-20 years is that domain scientists entered and started programming in whatever language they wanted, whether Perl, Python, whatnot. Over the last 10 years, and particularly the last five, research institutes, and some businesses as well, have actually hired professional programmers to help the researchers write better code. The researchers could say, "hey, I have this problem. I'm trying to do this type of analysis. I'm doing it this way. Can you help me? I can't get it to work. I'm spending too much time on it. I'm supposed to be doing this stuff in the wet lab, and science is what they pay me for. So help me program this better."
Having teams of those professionals in-house as essentially a core, on the bio side typically associated with the sequencing lab, or wherever they are generating all the data, is something I have seen become a more popular trend. There are a couple of things. One is funding it. If you're a research university, programmers cost money; they are not free like grad students. The other is retaining that type of talent. It's hard to get HPC people, bioinformaticians, or really good programmers, because you have got the unicorns of the world, the Googles and the Ubers, that will pay a lot more money than research universities will.
I think it's a trend that is going to continue, because people are going to want their researchers getting more done. It also speaks to that reinvention of the wheel. If they are going to reinvent the wheel, let the programmer do it, not the domain scientist. That is just a trend I have seen. If you can do it, great. If you can't, I think that is one of the reasons why the top tier institutes, whether it's the Hodge, Harvard, the Broad, or Washington University, keep staying ahead: they will hire actual programmers to do a lot of the heavy lifting for the researchers.
Dedicated Programmers at Universities [44:16]
Zane Hamilton:
Interesting. Is anybody else seeing that? Those dedicated programmers that are there to help?
Gary Jung:
Yeah, we have, at Berkeley Lab, on larger projects. One of the earlier questions was about whether we see people making use of CI/CD pipelines. I see it more where the research projects engage with a group that can provide software engineering. Software engineering is a way of working. We have taken people who are domain specialists, who did programming in their research and were really good at it, but there's still a way of working and thinking about software. Maybe the word I'm thinking of is sustainability: training these people so that you end up with sustainable code. It's kind of a big thing because of turnover, and people find out that software engineers can do things like help people look at the code and decide, "okay, if I want to make it do something different, do I hack on it and fork it, or is there a better way to do this so that it can take into account upgrades down the road?" There are a bunch of things that aren't readily apparent but can cost you a lot later. We are starting to see more software engineering and recognition of it as a structure that can pay off.
Zane Hamilton:
That's an interesting topic. Right? I don't know. John, if you had something you wanted to add to that before I ask Gary another question?
John Hanks:
Yeah. My current environment does a really good job at bio. We do a really good job of having developers that help the domain scientists. There's a lot of overlap between the development groups and the science groups, where the expertise gets blurred. The developers do a really good job of doing what the scientists need. I wanted to add, though, that in my previous job, which was my one foray into the real world, into an actual company and not an academic research institution, I saw the exact opposite thing. It was politically powerful groups of developers thinking they were driving the science, and it didn't work out well. If you think about it, that is never going to work out well, because if you do HPC long enough and you do it well, you will eventually have somebody come to you and say, "I need you to do this because the IT org is awful. They can't do it, won't do it, or whatever."
When you put the developers in charge of the science, what you're doing is putting the IT org in charge of the science. You're letting the tail wag the dog. That is never a good thing. I don't know how to avoid it. I have lost two great jobs now and had to move on because the politically powerful IT org came and crushed me. If somebody knows how to avoid that, it would be fantastic to let me in on that. I just throw it out there as something that happens. It's always bad in my experience when it does happen.
Zane Hamilton:
So this will apply to you too then, John. The question I was going to ask Gary is, as you have research scientists come in and work with a developer, are you doing things like pair programming, where you're actually pairing a research scientist with a developer so they can work together? Is that something we are seeing? That has gotten really popular, especially in Agile and CI/CD shops. Is that something that could help in this instance as well?
Gary Jung:
Yeah. I think the idea is a partnership. Because we are in an academic setting, we are thinking smaller, an extension of the project group as opposed to a large IT organization.
Dave Godlove:
I would like to jump in real quick.
Zane Hamilton:
Yeah. Absolutely, Dave.
Dave Godlove:
Sorry. I would like to jump in real quick and say something in response, along the same lines. John described a situation in which the developers were driving the project and were moving the science in one direction. I have seen the opposite, which is that I used to work in a lab where we had hired a programmer. If you didn't have time, you would take your problem to him and say, "okay, I need data analyzed in x, y, and z way." The problem was that part of programming is sitting down, breaking your problem into real steps, and creating an algorithm, right? That is part of programming. A lot of these folks didn't have the background to be able to do that.
So they would go tell the programmer, "I need x, y, and z," and he would program up x, y, and z. He would take it back to them, and they would say, "oh, that does exactly what I asked you to do, but it's not actually what I wanted." Because he didn't have that scientific context to be like, "oh, well, you said this, but really what you meant was that." So that is the opposite situation, one in which the scientists are driving the project too hard, in a vacuum, not really understanding what the real necessities are from the programming standpoint. I totally agree with you, Gary, that it needs to be a partnership. The programmers, if that is their background, need to be comfortable with and open to learning some science and to having one foot in that world a little bit. They have to have the capability to do that. On the other hand, the scientists really need to meet the programmers halfway and be open to learning a little bit about computer science, understanding a little bit about software engineering, and so on.
John Hanks:
It definitely works best when it's a collaboration and not somebody issuing orders and instructions to the other party that is just going to blindly follow those orders and instructions.
Alan Sill:
Right? So we have the most success here when we actually present people with information. It's sort of like the rule for emails: if you're asking for help, provide information. Don't just dump a question on someone. We will approach researchers with the output of our monitoring tools and say, "hey, it looks to me like you're hammering the file system pretty hard here, but then have large periods of inactivity. Can we rebalance that?" We actually have a very extensively developed monitoring system here at my center that is a research project. Some of it is being done with industry funds to improve it, and some of it made it into the Omnia framework from Dell, for example. It's beautiful graphics, which is what pulls people in, but what really makes it helpful is when you can say, "hey look, this is what your code is doing."
We are working right now, and this stuff is not ready for primetime yet, on extending this instrumentation into the MPI layer for some of our advanced codes, so we can go to people and say, "here's something that will help." It's an old habit of mine from way back in post-doc days, when I had the job of switching people from Fujitsu FACOM to VAX. I couldn't get anyone to come to my VAX software conversion meetings until I started showing up with what we would now call "pull requests," saying, "look, here's some code. It works, it's faster. Oh, it also happens to be VAX compatible." That is how you get people to change. You go and say, "hey look, this works. It's better than what you're doing now."
The Software Engineering Movement [52:27]
Now, how do we get such people? I wanted to mention the emergence of the whole research software engineering movement, which started in the UK over a decade ago. Now it has become quite a mature thing. There's an organization in the US working really fast and hard to bring up equivalent things here: us-rse.org. If you have never encountered it, it's a great place to send people who are thinking of careers in this area or are looking for resources. These are people who are in the research software engineering area and are trying to build a professional society and organization. They have conferences, I think they are having a conference right now, and tutorials. All the benefits of a community. I think people in the long run should treat it as a professional part of what we do and not something we squeeze into the corner somehow.
Software Development vs Scientific Software Development [53:30]
Zane Hamilton:
That's great. Thank you, Alan. I think one of the themes that I have heard throughout this is that general software development is not the same thing as scientific software development. They are a little bit different, or a lot different. Just because you're good at one doesn't mean you're good at the other. Would that be a true statement?
John Hanks:
Yeah. As Dave said earlier, I have recommended this to undergraduates as well. Every science undergraduate that I have ever talked to, I have told them, "double major in computer science." You want to do that because if you are an average biologist, you're just an average biologist. If you're an average computer scientist, you're an average computer scientist. But if you're an average biologist and an average computer scientist, you're a unicorn. You can basically name your salary at that point. The same thing goes for computer science and chemistry, or computer science and physics. When you cross those two things together, it jumps somebody way up the list of applicants for any position in either one of those areas.
Zane Hamilton:
Great. David, you were starting.
Dave Godlove:
Well, just to hit on that. As John was saying, when I went and interviewed for grad school, they did these crazy interviews where you would go and interview for two days at one place. You would do one-hour sessions, 8 to 10 hours a day, with each of the PIs. So you would meet everybody in the department and talk to them about the potential of joining their lab. I would get questions like, "are you familiar with MATLAB? Have you done any programming?" My answer to that was "no." In some cases, that was the end of the conversation. People were like, "okay, well, nice weather we are having today. We have got about 45 minutes to fill. I'm going to grade papers or whatever. Maybe you can just kind of sit there."
It's really important to have that background if you want to be serious about research computing. But back to your original question: scientific software development and other types of software development are different, and I know we wish they weren't, and sometimes we act like they shouldn't be, but they are. That is because the goals of the developers are different. If you are a software developer who is developing a tool that you want to sell, or it's open source and you want a lot of people to use it, you're going to worry about things like packaging and whether somebody else is able to run it on multiple different systems and in different environments. If you're a scientist who is looking to analyze some data, maybe you should worry about that stuff, but you're not. You're worried about getting your data analyzed on your desktop computer or in your HPC environment, getting those numbers that you need to put into your paper, and then being done with it and moving on to the next thing. Honestly, that is what you're worried about.
Alan Sill:
Yeah, and that is why we have jobs. This is helping such people and, to bring it a little closer to home for CIQ, of course supporting the ecosystem that allows migration and updating of operating systems without breaking your application base. That is where the hard work is, from my point of view. That is a core piece of what we have to do as we migrate these applications from ancient code, and I won't miss the chance to say Perl code, to something more modern.
What is Next For Software? [57:20]
Zane Hamilton:
Great. Thank you. We are coming up on time and I want to make sure everybody gets to have a last word. As usual, I have a last question to go around with. What do we see as next for software? I'm going to start at the top with Gary.
Gary Jung:
Boy, that's an interesting question. Someone just said that researchers just want to get their code to work, and we are starting to see that more. There was a series of discussions here at the laboratory about the future of science and scientific labs. People are talking about things like unmanned wet labs where robots actually do all the experiments. That is what's in the future. What we are hearing from a lot of people who work with scientists is that the scientists are not going to want to know all this implementation detail. It's just too much. We are at a point where people are more educated than ever before in using the systems, and now it seems like that is not what they want to do. They want to go back to something where they don't have to think about all that stuff.
Zane Hamilton:
Great. Thank you, Gary. Dave, what do you think's next?
Dave Godlove:
Well, that is a good question, and I don't really know. Certainly containerization and some of the tools that we are seeing come out are huge game changers. We are seeing a lot of change right now in the way that scientists develop and distribute code. I think, at a more basic level, the thing that I just said about scientists wanting to just get their code to run, and that is it, and not being concerned about some of the other best practices and things that are important from a software engineering standpoint, comes from working either in isolation or in very small teams.
When you see scientific fields begin to progress beyond what individual researchers or small teams of researchers can do, and start to require bigger groups of researchers, I think that is when you're going to see researchers really start to adopt some of these procedures from software engineering, which are necessary. Because that is a lot of what software engineering is. It's not writing code. It's figuring out how to write code as part of a team, with a big group of other people, and how to ultimately make sure that the code can run outside of your team with larger groups of people. So I think it's a social change which needs to happen in order to see better programming methods take off.
Zane Hamilton:
Thank you, Dave. Jonathan.
Jonathon Anderson:
So I won't make too strong of a prediction, but what I imagine for what comes next is that we have a generation of new people in the field who will come with an expectation of having functionally infinite and elastic computing available to them wherever they are. What we are going to see, probably everywhere, and are already seeing in a lot of places, and what may affect research computing in the future, is that expectation getting carried forward: I don't want to have to be sitting at my terminal writing code for the HPC system to be able to get results out of it. I have seen a little bit of that already. I have worked with a few projects that wanted to do edge computing with drone-based projects and sensors all over tarnation.
That kind of accessibility. One of the things that felt strange to me when we were setting up our Jupyter environment was people wanting to be able to access it from their phone, and even more so from a tablet. I think greater accessibility to the compute resources that we have, or a desire for it, is going to impact the kinds of computing systems that end up getting built. Which in turn impacts the complexity, and the need for that background in computing and programming that we are just going to start assuming everyone will have.
Zane Hamilton:
I think it's a good assumption, Jonathan. Thank you, John.
John Hanks:
Yeah. If I look at developers in general, web developers, everybody out there, not specifically picking on research developers, I think at least 50, 60, 70% of these people should be forbidden to ever touch a keyboard again. They are not helping. A lot of what comes out of this is just busy work to justify more developers in this weird economy we have, which seems to refuse to take a downturn. We have this bubble where people just keep hiring developers and paying them more and more to turn out basically nothing except cycle-burning stuff that converts electricity into heat. I'm really pessimistic about the software world getting better at this point. It just seems to be getting more arbitrarily complex, layered, and more interdependent. Short of a huge recession, I think we are headed for very rough times ahead.
Zane Hamilton:
Okay. Very expensive software. Yeah. Thank you, John. Glen?
Glen Otero:
I predict that universities will have to start having programming, like scientific programming, prerequisites for some of their majors. It's a wish, but I think it's going to have to happen. One thing I didn't mention is that as HPC has grown, it used to be the domain of the physicists, the astrophysicists, just the hard sciences. Well, biology has become so much more of a hard science now with all the data, as you mentioned. Archeology, right? Sciences that were considered soft have now become hard and have this influx of scientists. I think more and more there's going to be a need for people to be able to program as a baseline and not just as a nice-to-have for their career. That is my prediction. I hope they just won't turn electricity into heat.
Zane Hamilton:
It's a good wish.
Glen Otero:
Yeah.
Zane Hamilton:
Thank you, Glen.
Glen Otero:
Don't tell me I'm an optimist.
Zane Hamilton:
I will. Alan?
Alan Sill:
Yeah, so it's really hard to make a single prediction. I see expansion in the interactive, WebAssembly direction as the whole ecosystem becomes more powerful. You can run Fortran in your browser with WebAssembly, and as I mentioned, Picomamba is being worked on. I think that direction will continue. At the same time, I believe, as I mentioned at the beginning of the discussion when I joined, that the ability for people to self-educate on extremely sophisticated infrastructures that used to be the domain of just experts is going to grow. You can already bring up Kubernetes clusters with a dizzying variety of single push-button environments. That is going to become the norm. You're going to see that average folks will not bother stopping at the HPCC help desk, virtually or in person. They are just going to start composing these environments, and the tools to do so will become more sophisticated, partly through work by people on this call. I will stop there.
Zane Hamilton:
Thank you, Alan. Greg, what's next?
Gregory Kurtzer:
So I'm really torn on this. John's comment is right on, right? Software.
Alan Sill:
Might have had to reconnect.
Gregory Kurtzer:
Software tends to grow in complexity considerably.
Alan Sill:
You're just going to have to read his mind, Zane.
Zane Hamilton:
What he was apparently.
Gregory Kurtzer:
Is it not working?
Zane Hamilton:
No, I can hear you. I hear you.
Gregory Kurtzer:
Okay. Basically, it's growing in complexity in terms of layers and whatnot. We are trading performance for a number of other facets like portability, cloud native, and whatnot. I think that is a trend that is going to continue; I don't think it's going to stop. John, I definitely feel your pain, and I share it, but at the same time, I think where we need to be focused, and one of the things that we spent a lot of time on, is turn-key HPC. As we start thinking about turn-key HPC to really satisfy the needs of these broader and different types of use cases, we have to be thinking about what that looks like. That is the area where I try to add value: really thinking about the infrastructure that we need to be building and considering in terms of supporting those sorts of use cases, whether they are going the right way or not. Jonathan wrote a message which I couldn't stop laughing at for a little bit, which is, it's software-defined heating. Even if that is the case, it's our job to build that infrastructure, to support those use cases as best we can, and to give recommendations on how to optimize. It's always cheaper, in my experience, to buy more nodes than to buy more people.
Zane Hamilton:
Absolutely. Well guys, we are past time. I really appreciate you guys coming today and having this conversation. I think it's always fascinating to talk to people in different parts of this world. I appreciate it. Thank you guys for joining. Thanks for watching online. Join us next week. Go ahead, like, and subscribe. We will see you then. Thank you very much.