I want to extract and process the metadata from PNG images and the first line of .safetensors files for LLMs and LoRAs. I could spend ages farting around with sed or awk, but the file formats are constantly changing. I’d like a faster way to see a summary of the training and a few other details when they are available.
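For the PNG half, here is a minimal stdlib-only sketch, assuming the metadata you want sits in uncompressed tEXt chunks (compressed zTXt and iTXt chunks are not handled here). The function name `read_png_text_chunks` is made up for illustration. It just walks the chunk list: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(stream):
    """Return {keyword: text} for every tEXt chunk in a binary PNG stream.

    Hypothetical helper: only plain tEXt chunks are read; zTXt/iTXt are skipped.
    """
    if stream.read(8) != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = {}
    while True:
        head = stream.read(8)
        if len(head) < 8:
            break  # ran out of data without seeing IEND
        length, chunk_type = struct.unpack(">I4s", head)
        data = stream.read(length)
        stream.read(4)  # skip the CRC; this sketch does not verify it
        if chunk_type == b"tEXt":
            # Keyword and text are separated by a single NUL byte, Latin-1 encoded.
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
    return found


# Usage (the path is a placeholder):
# with open("image.png", "rb") as f:
#     print(read_png_text_chunks(f))
```

Which keywords show up depends entirely on the tool that wrote the image, so print the whole dict first and see what is there.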
Specifically this version of yq - there are other versions bundled with distros that look and act very differently and lack the potency of this version.
Seriously, can’t get those 15 minutes back.
And there is htmlq too, if you ever need to scrape some stuff from a website :)
Naw, everybody knows that you have to use regex for that
I have a very handy command in my .vimrc for this -
command! JSON setlocal filetype=json | %!jq .
Anytime I’m in a json file that isn’t formatted it’s as simple as typing
:JSON
to have it all sorted.
I found a Python project that does enough for my needs. jq looks super powerful though. Thanks. I managed to get yq working for PNGs, but I had trouble with both jq and yq on safetensors files. I couldn’t figure out how to parse a string embedded after an inconsistent binary prefix, especially in massive files. I could get in and grab the first line with head. I tried some stuff with expansions, but that didn’t work and sent me looking for others who have solved the issue better than I have.
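The "inconsistent binary" at the start is the reason jq and head choke: a safetensors file opens with an 8-byte little-endian unsigned integer giving the length of the JSON header, and only then comes the JSON itself. Read those 8 bytes, read exactly that many more, and you never touch the tensor data, so file size stops mattering. A minimal sketch, with made-up function names and an arbitrary sanity cap on the header size (both are my assumptions, not part of any library):

```python
import json
import struct


def read_safetensors_header(stream, max_header_bytes=100_000_000):
    """Return the JSON header of a binary .safetensors stream as a dict.

    Hypothetical helper; max_header_bytes is an arbitrary guard against
    pointing this at a file that is not actually safetensors.
    """
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise ValueError("too short to be a safetensors file")
    (header_len,) = struct.unpack("<Q", prefix)  # little-endian uint64
    if header_len > max_header_bytes:
        raise ValueError(f"implausible header length: {header_len}")
    raw = stream.read(header_len)
    if len(raw) != header_len:
        raise ValueError("file ends before the header does")
    return json.loads(raw.decode("utf-8"))


def training_metadata(header):
    """Return the optional free-form string map stored under __metadata__."""
    return header.get("__metadata__", {})


# Usage (the path is a placeholder):
# with open("model.safetensors", "rb") as f:
#     for key, value in sorted(training_metadata(read_safetensors_header(f)).items()):
#         print(f"{key}: {value}")
```

What ends up in `__metadata__` is up to whatever trained or exported the file, and it can be missing altogether, hence the empty-dict fallback.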
Nushell is pretty nice.
Yeah, I’ve been learning some nushell. If you’re dealing with data, it’s just a great tool. So many sharp edges in the POSIX shell come from it being stringly typed, so having a strongly typed shell is extremely helpful.
jq
I’d probably go Python but I’m an idiot
What are some good resources for learning jq? I really struggle when it comes to nested keys/values, which obviously limits my ability to use it.
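Not a learning resource as such, but one thing that can help with nesting is seeing every leaf of a document written out as the path you would type. Here is a rough Python sketch that does that; `jq_paths` is a name I made up, and it only emits the basic forms `.key`, `["odd key"]` and `[index]`.

```python
import json
import re

# Keys that can be written as .key without quoting.
_SIMPLE_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def jq_paths(value, prefix=""):
    """Yield (path, leaf_value) pairs, with each path written in jq syntax."""
    if isinstance(value, dict):
        for key, child in value.items():
            if _SIMPLE_KEY.match(key):
                step = f"{prefix}.{key}"
            else:
                # Keys with spaces, dashes, etc. need the bracket-and-quote form.
                step = f"{prefix or '.'}[{json.dumps(key)}]"
            yield from jq_paths(child, step)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from jq_paths(child, f"{prefix or '.'}[{index}]")
    else:
        yield (prefix or ".", value)


# Usage (the filename is a placeholder):
# with open("data.json") as f:
#     for path, leaf in jq_paths(json.load(f)):
#         print(path, "=", json.dumps(leaf))
```

Each printed path can be pasted straight into a filter, e.g. `jq '.a.b[0].c' data.json`, which makes it easier to see how the dots and brackets line up with the nesting.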
I hate to do this, but AI chatbots are typically pretty good at giving examples for things like this and you can learn from it.
AI chatbots are very good for teaching. I’ll give them that.
I definitely use them a lot, but I think “very” is too strong a word. It’s pretty easy to get confident, contradictory information from them. They’re a good place to start and brainstorm, but all the information has to be verified either by running and testing the code, or by finding a human source.
True. I wouldn’t use them for very complicated stuff. I currently use them for “what is x?” and “how is x different from y?” kinds of questions.
One advantage of using an AI is that it removes a lot of fluff that you get on blogs. However, that can change very soon when our AI overlords figure out monetization.
man jq
I have perused it, but it’s both so dense and so broad that it’s not that helpful unless I know exactly what I’m looking for. I have also tried info and tldr. I actually like tldr the most, although the exhaustiveness of the man pages must be admired. I don’t find it to be the best teacher.