My First Adventure with Stable Diffusion

8 Feb 2022

Klaudia Wereniewicz

My First Adventure with Stable Diffusion - cover graphic

Using AI tools to generate images can be really fun! In this article, I describe my experiments with DALL-E and Midjourney and how they turned out.

Recently, I have been particularly interested in AI image generation and tools such as Stable Diffusion. What caught my attention was the ability to generate unique images from detailed descriptions using free tools. DALL-E and Midjourney are two such tools. Both offer a range of possibilities, but free use is limited, and going beyond those limits requires a paid membership. As a deaf person with two cochlear implants, I was especially excited to try generating an avatar of a girl with two implants. I have limited knowledge of the subject and this was my first attempt at such a task, so I was not sure what to expect. The results were quite surprising.


The Journey


First steps…

I started with DALL-E and entered the prompt: back of the head of a deaf woman bilaterally implanted with 2 cochlear implants, neon glow at night around 3d rich colours.

Here, you can see the results:


It was awful. The result looked like a character from a fairy tale of my childhood, so I had to move on and try something else.


…Midjourney…

As the heading says, Midjourney was the generator I tried next. I asked for outputs based on this prompt: back of the head of a deaf bilaterally implanted woman with long dark blond hair and cochelar implants.

And it went like this:


Once again, I was not happy with the result. I had misspelled "cochlear" in the prompt, and the output turned out quite creepy, with weird ears and unrealistic hair. I was shocked, thinking “what is this?”.


…Let’s have another try…

Midjourney's style convinced me to try another prompt: back of the head of a deaf white blond woman bilaterally cochlear implant who is wearing two cochlear implant speech processors.

Here you can see the generated images:


This looks pretty good! The devices look very interesting and totally different from what I had imagined, but the AI struggled with placing the ears when showing the back of the head. Don’t they look funny?


…Abstraction is the way!

After these attempts, I decided to go more abstract, in a cyberpunk or neon style, so I entered the prompt: back of the head of a deaf girl bilaterally implanted with 2 cochlear implants on a light yellow background, neon glow at night around 3d rich colours in Avatar style, Maya style, jungle.

Check out the results:


For me, these results were the most interesting because the output was really creative. I loved it. I had finally got what I wanted!


Let’s also have some fun!

Just to play around with Midjourney, I decided to try a dog instead of a girl. I entered the prompt: ultra realistic deaf shiba inu with cochlear implants on a pink background. The results were adorable.

Check for yourself:


Summary

In conclusion, it's essential to watch out for typos in the prompt, as they can significantly affect the output and lead to results that are vastly different from what was expected. My experience also showed that AI still struggles with placing ears on the head or neck in atypical poses. The resulting poor anatomy makes it obvious that the image was generated by a machine. However, I found that images in the cyberpunk or "jungle" style were much more successful than those in a realistic style.
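The typo lesson can be turned into a quick check before submitting a prompt. Below is a minimal sketch in Python that uses the standard library's `difflib` to flag words that almost, but not exactly, match a term you care about. The term list `KEY_TERMS`, the function name `find_likely_typos`, and the `0.85` similarity cutoff are assumptions made for this illustration. They are not part of DALL-E or Midjourney.

```python
import difflib

# Hypothetical list of terms that must be spelled correctly in the prompt.
KEY_TERMS = ["cochlear", "bilaterally"]

def find_likely_typos(prompt, terms=KEY_TERMS, cutoff=0.85):
    """Return (word, suggestion) pairs for words that nearly match a key term."""
    suspects = []
    for raw in prompt.lower().split():
        word = raw.strip(".,!?;:")
        # Skip empty tokens and words that are already spelled correctly.
        if not word or word in terms:
            continue
        match = difflib.get_close_matches(word, terms, n=1, cutoff=cutoff)
        if match:
            suspects.append((word, match[0]))
    return suspects

# The Midjourney prompt from this article, including its typo.
prompt = ("back of the head of a deaf bilaterally implanted woman "
          "with long dark blond hair and cochelar implants")
print(find_likely_typos(prompt))
```

Run on the prompt above, this should flag `cochelar` and suggest `cochlear`. The cutoff is a judgment call: set too low, it also flags legitimate words that merely resemble a key term.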

In my opinion, AI is currently better at creating stylized, imaginative images than at producing realistic photos. It's worth noting that these images were generated on January 10th, 2023, that both Midjourney and DALL-E are currently in beta, and that the experience is still evolving significantly.

Address & company info


Chmielna 73B / 14,
00-801 Warsaw, PL

VAT-EU (NIP): PL7831824606
REGON: 387099056
KRS: 0000861621
