InattentiveRaccoon

joined 1 year ago

I've been working on my own fork of a set of Rust language wrappers around the famous llama.cpp project. This blog post covers my motivations, what I've added, and a sample project to get readers started.

 

I've finally released the AI chat software I've been working on. I'll try to write a blog post about it at some point, but until then, you can find more information at the GitHub repo.

Sentient_Core is a local LLM AI application, so all of the text inference is done on your machine. Optionally, it can call out to koboldcpp through its API, so it's possible to do the heavy compute on a different machine - or to use more up-to-date model formats, since rustformers' llm project is still trying to merge a current version of ggml after being behind for months.
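For the curious, koboldcpp speaks a KoboldAI-style HTTP API, so the remote path boils down to a single JSON POST. Here's a minimal Rust sketch of what such a call could look like - this is not Sentient_Core's actual code; it assumes reqwest (with the blocking and json features) and serde_json, and the host, port, and sampler values are placeholders:

```rust
use serde_json::json;

// Minimal sketch of a koboldcpp text generation request. The endpoint
// follows the KoboldAI-style API; the host, port, and sampler settings
// here are placeholders, not Sentient_Core's actual configuration.
fn generate_remote(prompt: &str) -> Result<String, Box<dyn std::error::Error>> {
    let body = json!({
        "prompt": prompt,
        "max_length": 200,
        "temperature": 0.7,
    });
    let resp: serde_json::Value = reqwest::blocking::Client::new()
        .post("http://192.168.1.50:5001/api/v1/generate")
        .json(&body)
        .send()?
        .json()?;
    // The generated text comes back under results[0].text.
    Ok(resp["results"][0]["text"].as_str().unwrap_or_default().to_string())
}
```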

If you have any questions or comments, feel free to ask away!

[–] [email protected] 2 points 11 months ago* (last edited 11 months ago) (2 children)

I got PETG working on my Mini clone by basically just switching to the 0.6 nozzle profiles from the wizard and using the generic PETG profile.

But if you're new to PETG, know that it gets real moisture sensitive, real fast. It sounds like you might benefit from trying to dry the filament out; check the web for more on that.

After making sure the filament is dry and that the initial layer is good (changing nozzles means you need to fine-tune the Z offset again), I would print a retraction tower test and dial in the retraction length settings.

That's a good basic troubleshooting list.

[–] [email protected] 2 points 11 months ago (1 children)

I haven't posted much to lemmy, but just from the title of this community, I would expect more negativity than average, because 'actually useful' is relative and, frankly, I can't be bothered to make my case. I guess I let my reddit experience, as someone there from its beginning when it started to take over digg, color my viewpoint of humanity on social networks.

 

Okay! I finally got a 'killer feature' prototyped in my terminal-based AI chat application ... vector embeddings! I've got it generating them from the chat log and searching for semantic relevance appropriately. It took a bit to sort this out because of the technology stack I'm using... But now it works.
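For anyone curious about the search half: once the chat log chunks have embeddings, relevance ranking is just cosine similarity against the query's embedding. Here's a minimal Rust sketch of that idea, assuming the vectors have already been generated - it's not the app's actual code:

```rust
/// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// Rank stored chat log chunks by semantic relevance to a query embedding.
fn most_relevant<'a>(
    query: &[f32],
    chunks: &'a [(String, Vec<f32>)],
    top_k: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = chunks
        .iter()
        .map(|(text, emb)| (cosine_similarity(query, emb), text.as_str()))
        .collect();
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
    scored.into_iter().take(top_k).map(|(_, text)| text).collect()
}
```

At chat log scale, a brute-force scan like this is plenty fast; a proper vector database only starts paying for itself with much larger collections.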

Life's gonna be rough for the next few days, but late next week I hope to actually write some more useful blog posts containing the stuff I've learned. I can't wait!

🤩

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago)

Go was my favorite mainstream language for GSD for years last decade. I was happy using it.

I came off a development 'extended holiday' by jumping into Rust dev over the last few months. I think I'm less happy, though my software is arguably functionally better, even if I fucking despise the way it looks and reads.

Maybe I just never realized I'm a grug brained developer at heart.

[–] [email protected] 3 points 1 year ago (3 children)

Yeah, without the config it'll be hard to say anything meaningful. If you're open to alternatives, Caddy does this well and is super easy to configure.
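To give a sense of just how easy: a complete reverse proxy Caddyfile can be as short as this (the domain and upstream port are hypothetical, and Caddy provisions the HTTPS certificate automatically):

```
example.com {
    reverse_proxy localhost:8080
}
```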

[–] [email protected] 3 points 1 year ago

I love the design of this ergo!

 

I finally finished writing this one up. My previous post here was about 1 minute after I got the finetuned model working for the first time, so it's taken me until now to put it all together. I'm still not super happy with the writing in this one, but I'm kinda 'over' the writing process right now ...

If you spot any errors, or have any questions, feel free to reply.

[–] [email protected] 1 points 1 year ago

I'm just too tired to deal with writing up everything until Friday, since I have a lot of hours at my day job this week.

But if by chance anyone is watching this, you can see the code I checked into the dev branch of Kontour, along with the configuration file used to generate the datasets and the scripts in data-cleaning that reformat everything.

https://github.com/tbogdala/kontour/tree/dev

 

Holy crap! I've finally done it. I've generated a dataset of conversations (all locally), cleaned them up, and then finetuned open-llama-7b on it (just to test) and IT WORKED! AHHHHH! happy noises

Okay, I gotta go to sleep now. I have to get up for work in less than five hours. I'll be cleaning up everything, committing code, and doing the write-up this week.

[–] [email protected] 2 points 1 year ago

I also have a sub to di.fm and love it. A new one I'm thinking of adding is brain.fm, which I've been enjoying on trial so far.

[–] [email protected] 3 points 1 year ago

I like Buttercup. It's open source and pretty simple to use. I personally just keep mine on Dropbox so my Mac, Linux, iOS, Windows, and Android devices can all access it. https://github.com/buttercup/buttercup-desktop

[–] [email protected] 3 points 1 year ago

Don't write anything other people can see. Don't share any art you make. Never publish code online. Don't post to large social media networks.

[–] [email protected] 3 points 1 year ago (4 children)

Thanks for those, btw. Both Docker and SQL are things I'm not super familiar with, so it was nice to have a guide.

1
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

I've been trying to make my own qlora finetunes for a little while now, but progress has been slow. For a long while I couldn't get anything to come out right besides the examples, so here's a blog post on the progress I made.

[–] [email protected] 4 points 1 year ago

What I like most about tree supports is that I can actually remove them pretty easily, compared to the usual fiasco I had to deal with in PrusaSlicer. I'm so happy support for them finally landed in 2.6.

1
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

I wrote some notes down in a blog entry after getting Lemmy and Caddy to talk to each other from separate docker-compose files. Nothing advanced here, but if you're like me and Docker only pops in and out of your projects occasionally, this might be a helpful guide for what changes to expect.
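For anyone who just wants the punchline: containers from separate compose files can only see each other if they share a Docker network, so one file creates the network and the other declares it as external. A rough sketch, using hypothetical service and network names rather than the ones from the post:

```yaml
# caddy/docker-compose.yml (sketch): joining a network owned elsewhere
services:
  caddy:
    image: caddy:2
    networks:
      - lemmy_net

networks:
  lemmy_net:
    external: true  # created by the Lemmy compose file, or via `docker network create lemmy_net`
```

With that in place, Caddy can reach the Lemmy container by its compose service name as the upstream hostname.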

1
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

This is my step-by-step guide on how to replicate fine-tuning of the example datasets using axolotl.

Last I checked, the bitsandbytes library copy was still needed and open-llama-3b was still problematic to quantize, but hopefully those issues get solved at some point.
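For a taste of what axolotl asks of you, the whole run is driven by a single YAML config. The fragment below is a hypothetical sketch in the style of its example qlora configs - the key names come from axolotl's options, but the values are invented for illustration:

```yaml
# hypothetical fragment, in the style of axolotl's example qlora configs
base_model: openlm-research/open_llama_3b
adapter: qlora
load_in_4bit: true
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
micro_batch_size: 2
num_epochs: 3
```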

What I didn't know when I first wrote the post was that it's possible to load the finetuned LoRA file in a frontend like text-generation-webui; I have since updated the text to account for that. There are performance side-effects to loading just the qlora adapter in the webui beyond the penalty to load time. The post should show how fast text inference was with little context, in tokens/s, while using the transformers library and the source model in f16 or quantized 8-bit & 4-bit, versus how fast I can run a merged q4_0 quantization.

1
The Blog Is Back (animal-machine.com)
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 

I did it again and started a blog. Going even further, I spun up a Lemmy instance so that I could control my online account in the fediverse. While doing so, it dawned on me that it'd be the perfect way to host discussions on my blog posts!

While I'm pulling out of a variety of online spaces, I'm moving to solidify my identity in fediverse spaces. So I'm @[email protected] here ... and currently @tbogdala over at mastodon.gamedev.place. The problem is that I have a lot of interests, and I just feel weird tooting about all of them on a 'gamedev' instance. Sigh... I might host my own too, later, I don't know. All of this is a little stream-of-consciousness, but that should cover the intro bit.
