12 Must-know Open-Source Projects on Artificial Intelligence (As a Competitive Software Engineer)

post-title

In today's fast-evolving tech landscape, staying up to date with the latest developments in artificial intelligence (AI) is crucial for software engineers. AI has become integral to numerous industries, from healthcare to finance and beyond. One of the best ways to keep your AI skills sharp is by getting involved in open source projects. In this blog post, we'll explore some open source projects that every software engineer should consider to stay current with AI trends and innovations.

1. Llama 2

It is an open source large language model provided by Microsoft and Meta. If you want to build something like ChatGPT without using OpenAI's API, then llama2 is one of the best options for you. It is free for research and commercial use.

Links:

https://github.com/facebookresearch/llama
https://github.com/ggerganov/llama.cpp
LLaMA2 with LangChain by Sam Witteveen
Llama2 in LangChain by James Briggs
How To Install LLaMA 2 Locally with TextGen WebUI by Matthew Berman

2. GPT Engineer

"GPT Engineer is made to be easy to adapt, extend, and make your agent learn how you want your code to look. It generates an entire codebase based on a prompt"(AntonOsika, 2023). Ideally if we give proper instruction, it can be used to write a whole software. Personally, I think this will help reduce time spent on typing for a Software Engineer but I think it is not perfect yet.

Links:

https://github.com/AntonOsika/gpt-engineer
How To Install GPT-Engineer by Matthew Berman
GPT Engineer by Dave Ebbelaar

3. Auto-GPT

Auto-GPT moves the AI industry one step closer to true artificial general intelligence. Auto-GPT is based on the language model GPT-4 and can be used in the same ways as chat bots like ChatGPT, Bard. But it automates tasks to achieve goals faster. When a user inputs a prompt, Auto-GPT will create sub-tasks for itself to process further until ultimate goal is reached. It also has features like Internet Access and Short Term Memory. Personally, I think this one is also not perfect yet.

Links:

https://github.com/Significant-Gravitas/Auto-GPT
How To Use AutoGPT by Howfinity
Install Auto-GPT Locally by Matt Wolfe

4. MetaGPT

MetaGPT takes a one line requirement as input and outputs user stories / competitive analysis / requirements / data structures / APIs / documents, etc. Internally, MetaGPT includes product managers / architects / project managers / engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.

Links:

https://github.com/geekan/MetaGPT
How To Install MetaGPT by Matthew Berman
MetaGPT Setup by Prompt Engineering

5. Stable Diffusion

Stable Diffusion is a text-to-image diffusion model and it is like the opensource alternative for Midjourney and DallE. The model behind it is relatively lightweight and runs on a GPU with at least 10GB VRAM. Personally, I think it is very good except when you generate faces with mouth open, the teeth are messed up.

Links:

https://github.com/CompVis/stable-diffusion
https://github.com/AUTOMATIC1111/stable-diffusion-webui
https://github.com/lllyasviel/Fooocus
Stable Diffusion Crash Course by Freecodecamp.org
Advanced Stable Diffusion Features in Fooocus by HolidayEffects

6. YOLOv8

Even though generative models have been hot news, Artificial Intelligence is not just about Generation. YOLO, which stands for "You Only Look Once," is a popular deep learning object detection model used in computer vision tasks. YOLO models have been widely used in various applications, including autonomous vehicles, surveillance systems, robotics, and more, where real-time object detection and classification are crucial. They offer a good balance between accuracy and speed, making them valuable tools in computer vision research and practical applications.

Links :

https://github.com/ultralytics/ultralytics
Train Yolov8 on a custom dataset by Computer vision engineer
Simple YOLOv8 Object Detection with Webcam in Real-time by Nicolai Nielsen

7. LangChain

LangChain is a framework that makes it easier to build applications using language models. It allows applications to do two key things: Be context-aware: This means connecting a language model to various sources of context, like prompt instructions, a few-shot examples, or relevant content, to help the model provide better responses. Reason effectively: LangChain helps applications rely on a language model to make informed decisions, such as how to respond based on the given context and what actions to take.

Links:

https://github.com/langchain-ai/langchain
https://github.com/msoedov/langcorn
LangChain Crash Course by codebasics
LangChain Complete Tutorial by Coding Crashcourses

8. Hugging Face

Hugging Face is a company and open-source community that is well-known for its contributions to natural language processing (NLP) and machine learning. They provide a wide range of NLP-related resources, including pre-trained language models, libraries, and tools. Their Transformers library is widely used in the NLP community for working with transformer-based models.

Links:

https://github.com/huggingface
https://huggingface.co/models

9. Bark

Bark is a transformer-based text-to-audio model. It can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying.

Links:

https://github.com/suno-ai/bark
Bark AI Full Tutorial by 1littlecoder
Voice Cloning with Bark by Prompt Engineering

10. LocalGPT

LocalGPT is an open-source initiative that allows you to converse with your documents with complete privacy. With everything running locally, you can be assured that no data ever leaves your computer. It can Seamlessly integrate a variety of open-source models and is built on top of LangChain.

Links:

https://github.com/PromtEngineer/localGPT
Video by Prompt Engineering to get started with it

11. h2oGPT

Query and summarize your documents or just chat with local private GPT LLMs using h2oGPT. It can summarize private offline database of any documents such as PDFs, Excel, Word, Images, Code, Text, MarkDown. And supports a variety of models such as LLaMa2, Falcon, WizardLM, LORA etc. Also has UI and CLI too.

Links:

https://github.com/h2oai/h2ogpt
https://evalgpt.ai/
Video by Matthew Berman to get started with it

12. ChatDev Communicative Agents for Software Development

ChatDev stands as a virtual software company that operates through various intelligent agents holding different roles, including Chief Executive Officer , Chief Product Officer , Chief Technology Officer , programmer , reviewer , tester , art designer . These agents form a multi-agent organizational structure and are united by a mission to "revolutionize the digital world through programming." The agents within ChatDev collaborate by participating in specialized functional seminars, including tasks such as designing, coding, testing, and documenting. The primary objective of ChatDev is to offer an easy-to-use, highly customizable and extendable framework, which is based on large language models (LLMs) and serves as an ideal scenario for studying collective intelligence.

Links:

https://github.com/OpenBMB/ChatDev
Video By Mathew Berman on How To Get Started With It

---