Mid-year Updates from FollowFox.ai

Hello, FollowFox community!

About six months ago, we wrote about our plans for 2023 (link). A lot has happened since then. Some things went as planned, some were unexpected, but most importantly, we are as active as ever and growing daily!

We decided to use this mid-year point to reflect on the last six months and set the tone for the rest of the year.

Our Blog/Newsletter

Since January, the reach and audience of our blog have grown more than 10x! A huge thanks for your interest and support. It is reassuring to know that our stance of openly sharing pretty much everything we work on is relevant and useful for so many people.

We will continue this work as planned earlier in the year: this resource will remain free, we will write at least once a week with occasional additional posts, and the content will mostly cover the same topics as before.

Special thanks to all of our subscribers! We notice and appreciate each of you, and the open rates of our newsletter emails, which are top class by all benchmarks, mean a lot to us.

Extra special thanks to the bunch of people who pledged money to our newsletter. As mentioned, we plan to keep this resource free, so we will not be claiming those pledges, but such support is beyond anything we expected earlier this year.

Finally, we want to acknowledge direct and indirect contributors to our content. Whether it is a simple comment or piece of feedback, the private discussions we have been having across different channels, or sharing your datasets, models, and approaches with us - it has been a huge contribution to improving the overall quality of the content.

Followfox.ai Team and Mindset

Since January, our team has taken shape as three like-minded, dedicated individuals. On top of the core group, we have collaborated with many others, from advisors to task-specific freelancers and experts.

To date, followfox.ai is 100% self-funded, and we plan to stay that way. While this is our preference, it is not some stubborn stance - we can imagine specific needs, circumstances, and arrangements arising that could change this.

In the meantime, a few perspectives and beliefs have solidified or arisen within our team during the last six months. Let’s start with the two that we have already mentioned before:

We still believe in a future powered by hyper-customized AI solutions for different needs and tasks. And followfox.ai aims to play a role in making that future happen by allowing broader audiences to get access to such customized AI solutions.
We believe that reducing complexity and bringing convenience are the two key aspects to open the doors for broader audiences to the power of generative AI.

And a few new themes that we have been building around:

Open-source: if we initially saw open-source as something that enabled our existence, today we are even more bullish on it, and we think it will be the winning strategy in many aspects of generative AI. Check out our post for why we believe AI will be Open-Source (link).
Flexibility: no matter how you put it - this space is extremely unpredictable. The core cash cow at the beginning of the week can become just another capability of a freely released tool by the end of the week. And with several indicators that taking preventive measures against such disruptions will be increasingly difficult, we are building followfox.ai with flexibility at its core to stay relevant and successful.

Academy - Practical Introduction to Generative AI

Making a course was not part of our plans, but we started considering it due to continued interest and requests. We are well aware that while relatively accessible, our content is still tailored to the enthusiast community that wants to advance the collective capabilities of this very community.

However, there is another group of individuals who are interested in the space but need some support to get started with hands-on experience. So our academy, for now, will focus exactly on that - a very practical starting point for those who want to join this enthusiastic community.

If this interests you, please let us know by joining our waitlist.

Stable Diffusion Related Work

SD remains one of our core focus areas for many reasons, ranging from its accessibility to our accumulated knowledge, the interest of our audience, its practical applications, its value as an onboarding tool for the gen-AI space, and many more.

Like the rest of the community, we actively watch and wait for the SDXL release. There are reasons why we all should be excited for this release but also a few good reasons for concerns and reservations. We will experiment with it but won’t commit to heavily adopting it until we see more.

Independent of the SDXL release, we will continue working on the roadmap that we published a couple of weeks ago:

Vodka Series:

Vodka V3 (complete, link) - adding tags to captions to see their impact
Vodka V4 (complete, link) - addressing the ‘frying’ issue by decoupling UNET and Text Encoder training parameters
Vodka V5 (data cleaning stage) - we have gathered a much larger dataset and will put a lot of emphasis on data prep. This version will be almost a new starting point for further experimentation.
Vodka V6 (TBD) - re-captioning the whole dataset to see the impact of using AI-generated captions vs. original user prompts
Vodka V7+, for now, is a parking lot for a bunch of ideas, from segmenting datasets and adjusting parameters accordingly to fine-tuning VAE, adding specific additional data based on model weaknesses, and so on.

Cocktail Series:

These models will be our mixes based on Vodka (or other future base models).

Bloody Mary V1 (complete, unreleased) - Our first mix is based on Vodka V2. Stay tuned for this: Vodka V2 evolved from a model that generates good images with the proper effort into one where most generations are very high quality. The model is quite flexible and interesting.
Bloody Mary V2+ (planned) - nothing concrete for now except for ideas based on what we learned from V1 and improvements in Vodka base models.
Other cocktails (TBD) - we have plans and ideas to prepare other cocktails, but nothing is worth sharing for now.

LORAs, Textual Inversions, and other add-ons:

We have started a few explorations on add-on type releases to boost the capabilities of our Vodka and Cocktail series, so stay tuned for them.

Training the first LORAs (complete, link) - a very basic starter guide on how to train a LORA, paired with the release of the photorealistic enhancement for our Vodka models. This was a lazy, quick attempt, and while the results look impressive, there is a lot more to learn and explore.

Language and LLMs

For now, our work on this front is relatively limited and is concentrated in two areas. The first is the practical application of some available tools in our workflows, especially for coding. Many of these tools are awesome, and using them is the best way to stay up-to-date on what’s possible.

Second, we are actively monitoring what’s available in the open-source space and waiting for the “Stable Diffusion moment” of LLMs. We have run and tested almost every notable open-source LLM-related release to date. There are a lot of cool things there. Still, given their current practicality and capabilities, we decided not to prioritize deeper explorations over other things we are working on.

One language-related area that we are working on is Stable Diffusion use cases. Image captioning and user prompt pre-processing before image generation are the two areas we have been exploring for a while.

Distillery and Other Projects

There are a few more commercial (both b2b and b2c) and non-commercial projects that we have been working on.

Archeology

For example, we have been collaborating with Professor Maurizio Forte from Duke University (link) to explore applications of generative AI in archeology. As a result of this collaboration, our text was included in the workshop by Cineca (the largest Italian computing center) on AI and Cultural Heritage. The full text is here - our part starts on page 43 (link). We also experimented with creating immersive viewing experiences of ancient monuments and brought characters from The Pyrgi Relief to life. A lecture on the topic was presented at the Field Museum in Chicago. Read more here (link) or check out one of the video outputs that we created.

Distillery

We will be sharing a lot more very soon, but for now we will tease that this will be our first broader audience-facing project, and the MVP version of the product is undergoing its final tests.

Everything Else

We have a few more things that we are actively working on, but we can’t share more details for now, either because the timeline is too far out or because we have yet to commit to those projects fully. Stay tuned for future updates, and we promise to keep things exciting!