The Assurance Show

Episode 12 | Open Source software for internal audit and performance audit

May 18, 2020
Episode Summary


In this episode we discuss the use of open source software for internal audit and performance audit.

We bust 3 common open source myths:

  1. Security - that open source is less secure
  2. Skills - that open source software has smaller talent pools
  3. Quality - that open source has lower quality.

Link: Blog: 3 open source myths that might be inhibiting your team's progress

Episode Transcript

Conor:   0:27
Okay, today's topic. We're going to be discussing the use of open source software as part of our auditing toolkit. We'll look at some that are particularly useful, and as we go along we might also bust a few myths about the use of these tools. But maybe if we could just start off, Yusuf: why is this topic of open source technology and open source analytical tools important to auditors?

Yusuf:   0:51
When we talk about open source, the first thing is that we're talking about software for which you are able to get access to the source code. It's not always free, right, but open source means that the source code is accessible. If you think about proprietary tools, they don't allow you to dig into the underlying code, whereas open source tools, when they make the software available, also make the code that is used to create that software available.

Conor:   1:23
Why is the ability to see that source code important for auditors?

Yusuf:   1:26
There are various reasons that we want to use open source tools. One of them is cost: open source tools are not always free, but a lot of them are. So if we're starting to use analytics, or even if we have been using analytics for a while, and there's a more cost-effective alternative to proprietary software, so paid software, it just makes sense to use it. The second thing, and we'll explore this when we talk about the different myths, is that quite often there is a relationship between the level of openness and the level of quality. That's because the fact that the source code is open means that there's more scrutiny over it. The third thing is that when you look at some of the more useful tools and programming languages, a lot of them are in the open domain. So the programming language Python is in the open domain. The tool that we use quite regularly, KNIME, is in the open domain, and there are a few others like that where you can actually get into the source code. If you've decided that you're not going to be using open source because it's quote unquote risky, then you're missing out on a range of tools that could be really beneficial.

Conor:   2:37
Quite often in our work, we hear a lot of feedback or issues around three things in particular that create barriers to the adoption of open source tools. The first one is security: people think that open source is not secure. The second one is the skills factor: some auditors consider that it's a bit more difficult to use open source tools than others. And the third one is quality: the idea that the base code or source code isn't as good as that of some of the proprietary tools.

Yusuf:   3:05
The open source ecosystem, particularly as it relates to data and analytics tools, is massive, right? There are hundreds of open source tools available. All of those tools have their code available for everybody to view, and because that code is available to view, quite often people see that as a potential security risk. What creates even more of a perception of risk is the misconception that open source means that everything is available and everything that you do will be transmitted to the Internet. Most people that listen to the show will be a bit more savvy than that. But the reality is that quite often we are dealing with not just our own teams, but also IT teams and marketing teams and other teams that we have to talk to, to get through the approval process for using some of those open source tools, either when we're installing them or when we're explaining to our stakeholders that we're going to be using them. And quite often the question that comes up is this: because it's open source, is my data all going to be available to view, and is my data still private? The reality is that there's no connection between an open source tool and an online tool. They are different things. You can get an online tool that is open source. You can get an open source tool that is continuously connected to the Internet and that is transmitting data. But that's not what open source means. You can have that risk, if you like, where data is actually transmitted to external parties, with both open source tools and proprietary tools. Open source code is available for everybody to view. That really is the only difference between an open source piece of software and a proprietary piece of software: you can see the code. Now, back to the first thing that we mentioned: the code is open and available for everybody to view.
Does that mean that you have a high risk of attack? That's not the case. Firstly, because you have more developers looking at the code, they usually identify vulnerabilities in the code much faster than you would with proprietary software. Proprietary software usually has a small developer community. Open source software has a very large developer community. The second thing is that there's a much lower potential for backdoor risk. Sometimes you think, oh, what if my data can be transmitted from this open source tool to a third party? You can programme a backdoor into open source software, but that would be available for people to find, because they can see all of that code. So if somebody did programme a backdoor into that software, they would be putting themselves at risk of everybody being able to look at that code and find the backdoor that's been programmed. The risk is actually higher with closed source software: if you think about programming a backdoor into software, you would want to do that in closed source software, because people then don't have access to the source and can't see that backdoor.

Conor:   5:56
On that basis, it seems to me that it would be really useful for auditors to be able to discuss with their IT teams the distinction between, for example, open source tools to help them do their jobs versus online tools, so that there's really no misconception about risks that probably do not exist in reality with those open source tools.

Yusuf:   6:17
As an auditor, whether or not you decide you're going to be using any open source software, it's important to understand this myth that open source is not secure, both from the perspective of using it yourself and in terms of evaluating other software that's used across the organisation. You don't want to be going into an audit, finding a piece of open source software and saying, "Oh, is this secure? How can it be secure if it's open source?" I mean, we've had situations like that, and really, you end up with egg on your face. So understand that risk, or that opportunity if you like, both for your own use and for where you're going to be looking at open source software that's used by other parts of the organisation as part of your audits.

Conor:   6:53
Okay, so myth number one, that open source is not secure and creates risks for the organisation: we've just busted that myth. Myth number two, Yusuf, another one that we consistently hear on our travels, is one relating to skills: that it's harder to come by people with specific open source software skills.

Yusuf:   7:13
So the myth is that skills are harder to come by than with some of the older software that people use. That's partially true. Yes, there are some situations where open source software is newer, and because it's newer, it hasn't had as significant adoption as some other tools. The reality is that those skills are much easier to acquire. If you are bringing new people into the team, particularly graduates or people who have been working for just a few years, they would largely have been exposed, in universities etcetera, to some of these open source pieces of software. Then there's the rest of your team, who may have studied or gone through their training when this sort of software wasn't available yet, or who were just trained on proprietary software for whatever reason. In the audit world, just to explain, there are a few proprietary pieces of analytics software. They're really old, and they don't really do the job anymore. But they've been widely used because a lot of the people that have been coming into audit teams have come from some of the bigger firms, where that software is used for audit analytics. Now, the reality is that because the software is open source, there are broad user communities. There are lots of free resources: they have blogs and forums and videos and training courses and people that you can talk to. It's far easier to learn. The software is open source, so if you really want to go in and understand what the software is doing, you can do that. If you really want to get down into the details, you can debug some of that code and see exactly what's going on. You can't do that with proprietary code.
The situation, like I said, when you bring graduates and students in, is that the adoption rates are quite high amongst university students for open source software, because there's no barrier to entry. The software is free, they can download it, they can inspect the code and they can work out how to use it. And like I said, there are broad communities to help them with that as well. So overall, for some software that's newer, or in certain jurisdictions, it may be that the skills are more difficult to come by than with some of the older software. But overcoming that skill barrier is not an insurmountable task, and quite often you will find a larger number of people with skills in the open source software, just because they've been able to get access to it. If you use open source software, your total cost of ownership could potentially be lower. Your training cost, which is usually a large cost of using software, training being both formal training and ongoing training, is lower. The tools are easier to use and it's easier to find solutions for them, and so that means that your total cost of ownership for that software is potentially lower than it would be with your proprietary software.

Conor:   9:58
Myth number two, that skills are harder to come by for open source software than for the older software: we've just debunked that. So the third myth of the three we want to discuss today is in relation to the quality of the open source software out there. The myth being that, because it's open source, and quite often, as you've described, Yusuf, it may even be free or lower cost, you get what you pay for, so it's lower quality code than some of the products that you do have to pay for.

Yusuf:   10:30
The level of quality is not directly related to the level of openness or closedness, if you like, of source code. It all depends on how well that particular piece of software has been adopted, whether it's open or closed, because the level of quality of code is dependent on how many programmers you have looking at it and what the level of quality assurance is over the code that underpins that software, whether it's open or proprietary. With open source software, you have large active communities, particularly for the ones that you're going to be looking at. I wouldn't recommend going to fringe pieces of software that one or two people have looked at or used, because then you're in a situation where you're just getting potentially risky software, right? I'm talking about the better understood, better known and better adopted open source software. You then have very large, active communities where they identify and fix errors fairly quickly. A lot of the software is actually provided by community contributors, as they call them. Some of these are corporates and some of them are individuals. Those individuals and corporates are usually very passionate about their work, and so they do their work for free, or they have some sort of incentive, you know, they're paid to do their work in some way. What happens is that the work that they do isn't directly incorporated into the software. You usually have a central coordination team. In the case of open source software that is provided by proprietary organisations, they would coordinate that activity centrally, which means that they would check the code and make sure that it would pass any sort of standards that they would have developed for implementation of that software.
If the open source software is developed by a not-for-profit group, you would have a similar situation, and they would then have individuals that are brought together as technical QA teams to check that community contributions are providing a reasonable level of quality of code. Contrast that with proprietary software, software where you can't see the underlying code. With closed source software, you really are limited by how many developers they have to develop that software internally. Because the source code isn't available for the broader community to see, you're then limited by what they are able to do within their teams. So if they are small developers of software, you have a potentially higher risk of lower quality. So overall, I would say that the better adopted open source software, the ones that are well used and understood, would potentially have a higher level of quality of code than closed source software. Like we said, a lot of open source software is free, and what you then end up with is that a lot of free open source software potentially has a higher level of quality in the code than closed source software that you could have paid thousands of dollars for.

Conor:   13:24
Because you've got more eyes on it, so to speak, from its users around the world who have a significant interest in continually improving that code.

Yusuf:   13:33
Yep, you have a lot more people looking at it because they can.

Conor:   13:37
Okay, so that was myth number three. That myth was around people considering that because some software is open source, it has lower quality than some of the software tools that you have to pay for. So that myth has also been busted. They were probably the three most consistently heard myths we come across, and over the years we've obviously used various open source and proprietary tools to perform some of our analysis during assurance engagements. But we have landed on some open source software that basically ticks all of our boxes in terms of the assurance work that we do, particularly the more sophisticated data analysis. And that's software called KNIME. So having busted those myths, would you mind taking us through a few of the attributes that we consider KNIME to be particularly strong in?

Yusuf:   14:32
Obviously, a lot of people use Python for analytics work. Python is an open source programming language. It's a fantastic language. R is the other one: a free and open source programming language that a lot of people use. Both R and Python are widely adopted for analytics. The difficulty with using programming languages is that they often have a steep learning curve, and you are writing, you know, thousands of lines of code to get to a solution. The software that you mentioned, KNIME, is open source, and it's free to download. And they have quite strong community contributions and community input: forums, blogs, training, etcetera. That piece of open source software has worked exceptionally well for us. We don't have to go in and code, but we can code in Python or R if we like, because they've actually got integrations with Python and R where you can go in and write your Python code or write your R code within a KNIME workflow, as they call it. But because KNIME has a graphical user interface, with a couple of thousand different modules that you can plug and play to create the workflows that enable the analysis that you're doing, we've found it to be super useful, and we use it exclusively. So that's the open source tool that we use for analytics. And then obviously there's a range of other tools that we'll use for visualisation and the like.

Conor:   15:52
And that's KNIME: K-N-I-M-E, silent K.

Yusuf:   15:56
Developed out of the university down in Konstanz in Germany, and they now have headquarters in Berlin, in Zurich. And they've got a few offices in a few other places. A disclaimer: they do sell proprietary software, which is the server software. That's the proprietary version, and it really extends the KNIME analytics platform to enable online collaboration and to enable scheduling of server jobs and the like, so analytics workflows that run on a frequent basis. That software relies on the KNIME analytics platform underneath it, and that KNIME analytics platform is completely open source and free. So whether you use the server or don't, which means whether you pay for it or don't, you get the same level of functionality in terms of the analytics functions. The underlying KNIME analytics platform is open source. It is free. You can download it and use all of the modules that are available. In fact, there is no proprietary KNIME analytics platform. You can get a few modules that certain third parties charge for, but KNIME themselves don't charge for any of the modules. What they do charge for is the server. And the reason for this disclaimer is that we are a KNIME partner, and so we are able to distribute the server software. However, what we're talking about in terms of the open source software that we use is the KNIME analytics platform, which is free and open source, and which we use as free and open source for all of our projects and audits.

Conor:   17:23
So, as a less technical person, because it's got that graphical interface front end, you can really see the logic of how the analysis comes together. This really helps you to understand an end-to-end piece of analysis, when you can actually see the logic that's happening on the screen in front of you.

Yusuf:   17:39
It's a lot easier to review as well. You talked about the graphical user interface making it easier to see what's going on, but it's also easier to review. Obviously, there's a range of other analytics software platforms out there that have a graphical user interface. There aren't that many, or rather I don't think we've come across any, that have similar levels of functionality and that are also free. A lot of them, you know, you pay thousands of dollars for.

Conor:   18:04
So, our discussion today was around some of the open source software that's available to assurance professionals. We busted three myths along the way. The first one being about open source not being secure: certainly busted that one. The second one, that it's more difficult to find people with the skills to use open source software: also busted. And the third one being the myth around there being lower quality with the use of certain open source software.


Conor McGarrity
Podcast host

An authority on data-focused audits, Conor is an author, podcaster, and senior risk consultant with two decades experience, including leadership positions in several statutory bodies. He’s driven to help auditors uncover new insights from their data that help them to improve organisational performance.
Yusuf Moolla
Podcast host

Fellow podcaster, author, and senior risk consultant, Yusuf helps performance auditors and internal auditors confidently use data for more effective, better quality audits. A global leader in data-focused auditing and assurance, Yusuf is passionate about demystifying the use of data and communicating insights in plain language.