Broadcast Engineer at BellMedia, Computer history buff, compulsive deprecated, disparate hardware hoarder, R/C, robots, arduino, RF, and everything in between.

Using machine learning to pull Krazy Kat comics out of giant public domain newspaper archives

Joël Franusic became obsessed with Krazy Kat, but was frustrated by the limited availability and high cost of the books anthologizing the strip (some of which were going for $600 or more on Amazon); so he wrote a scraper that would pull down thumbnails from massive archives of pre-1923 newspapers and then identified 100 pages containing Krazy Kat strips to use as training data for a machine-learning model.

After a couple of false starts, which Franusic documents, he was able to train a model by feeding the 100 "krazy"-containing thumbnails, along with a set of thumbnails without Krazy Kat that he labeled "negative," to a Microsoft Custom Vision algorithm. He shelled out $180 for Microsoft's "Advanced Training" to be applied to his data, then set the resulting model loose on the remaining thumbnails.

The model crunched through the remaining thumbnails, then Franusic automated the download of full-sized scans from pages identified as likely to contain a Krazy Kat comic. When the dust settled, he had hundreds of Krazy Kat comics in a folder, including one strip that does not appear in any published book that Franusic was able to find.
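The selection step described above can be sketched roughly as follows. This is a minimal illustration, not Franusic's code: the `Prediction` record, the `pages_to_download` helper, and the 0.9 cutoff are all hypothetical, and no Custom Vision API call is shown.

```python
from typing import Iterable, List, NamedTuple


class Prediction(NamedTuple):
    page_id: str        # hypothetical identifier for one scanned newspaper page
    krazy_score: float  # hypothetical classifier probability for the "krazy" label


def pages_to_download(predictions: Iterable[Prediction],
                      threshold: float = 0.9) -> List[str]:
    """Return ids of pages likely to contain the strip, highest score first.

    The 0.9 default is an assumed cutoff, not a value from the write-up.
    """
    likely = [p for p in predictions if p.krazy_score >= threshold]
    likely.sort(key=lambda p: p.krazy_score, reverse=True)
    return [p.page_id for p in likely]


if __name__ == "__main__":
    sample = [
        Prediction("page-a", 0.95),
        Prediction("page-b", 0.20),
        Prediction("page-c", 0.99),
    ]
    # Only the ids that clear the threshold would be queued for full-size download.
    print(pages_to_download(sample))
```

Lowering the threshold would trade more manual review for fewer missed strips.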

Franusic has done an excellent job of summarizing his process notes, including source code, and has offered to share a complete set of notes with anyone who wants to build on his work. He's also produced a set of recommendations for people trying this kind of work in future, as well as a wishlist for newspaper archivists who are hoping that projects like this will surface interesting things in their archives.

Things I wish I could have done

* Train an image classifier to recognize the comic boundaries

I included the full newspaper scans on this site because I didn't want to crop all of those images by hand, and also because that's a job that a good image classifier could do automatically?

* Train an image classifier that can find all types of comics

I was only interested in Krazy Kat comics that were published on Sunday. In the process of manually looking through the archives, I ran across many other interesting comics that I would have liked to have extracted too. The Katzenjammer Kids and Winsor McCay's comics in particular.

* Train an image classifier that can find the daily Krazy Kat comics

From what I can tell, most of the daily Krazy Kat comics haven't been published. Doing this would allow the world to see thousands of new Krazy Kat comics.

* Investigate approaches to automatically restore comics

Given that we have scans of original artwork available, as well as the ability to pull the same comic from several archives, doing automatic comic restoration seems like an approach that's worth investigating.

Krazy Kat Comics [Joël Franusic]

(via Four Short Links)

tekvax
12 days ago
Burlington, Ontario

Generate Cryptographically Secure RANDOM PASSWORD

$ python -c "import string; import random;print(''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits + string.ascii_lowercase) for _ in range(16)))"
Explanation: https://stackoverflow.com/questions/2257441/random-string-generation-with-upper-case-letters-and-digits/23728630#23728630
Why 16 characters: https://www.wired.com/story/7-steps-to-password-perfection/
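On Python 3.6 and later, the same idea can be written with the standard-library `secrets` module, which draws from the same operating-system randomness source as `random.SystemRandom`. A minimal sketch, where the `generate_password` name and its defaults are illustrative choices rather than part of the original one-liner:

```python
import secrets
import string

# Same alphabet as the one-liner: upper case, lower case, and digits.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 16, alphabet: str = ALPHABET) -> str:
    """Build a password by picking each character with the OS-backed CSPRNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

Adding `string.punctuation` to the alphabet would widen the character set at the same length.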

commandlinefu.com


tekvax
23 days ago
Burlington, Ontario

Generate Cryptographically Secure RANDOM PASSWORD

$ cat /dev/urandom |tr -c -d '[:alnum:]'|head -c 16;echo
Change :alnum: to :graph: for all printable characters
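What the pipeline does is read random bytes, discard every byte outside the allowed class, and stop after 16 survivors. A rough Python equivalent, assuming an ASCII alphanumeric class (the `filtered_random` name and the 64-byte chunk size are illustrative):

```python
import os
import string

# ASCII letters and digits, roughly what [:alnum:] matches in the C locale.
ALNUM = (string.ascii_letters + string.digits).encode("ascii")


def filtered_random(length: int = 16, allowed: bytes = ALNUM) -> str:
    """Keep only allowed bytes from os.urandom until `length` are collected."""
    kept = bytearray()
    while len(kept) < length:
        chunk = os.urandom(64)  # read a small batch of random bytes
        kept.extend(b for b in chunk if b in allowed)  # drop everything else
    return kept[:length].decode("ascii")


if __name__ == "__main__":
    print(filtered_random())
```

Because rejected bytes are simply thrown away, each kept character is equally likely, which is why the filter does not bias the result.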

commandlinefu.com


tekvax
23 days ago
Burlington, Ontario

Alert visually until any key is pressed

$ while true; do echo -e "\e[?5h\e[38;5;1m A L E R T $(date)"; sleep 0.1; printf '\e[?5l'; read -s -n1 -t1 && printf '\e[?5l\e[0m' && break; done

commandlinefu.com


tekvax
23 days ago
Burlington, Ontario

Tiny Tank Inspects Your Crawlspace

If you’ve got some drone or FPV parts lying around, this is the build for you. It’s a remote controlled tank, with a camera and video transmitter, that’s only 65 mm x 40 mm x 30 mm in size. Why on Earth would you ever build something so small? You can look around in your crawlspace, I guess. Any way you look at it, this thing is tiny.

The tank has traditional tank skid steering through two brushless motors. The battery is one cell, as that’s just about the largest battery you can put in a vehicle so small, and the camera is just off-the-shelf quadcopter stuff set into a 3D printed enclosure. There are a few LEDs for lights. Other than that, it’s just so tiny and so cute.
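For readers unfamiliar with skid steering, the usual control mixing looks something like the sketch below. It is a generic illustration, not [honnnest]'s firmware; the `skid_steer_mix` name, the -1.0 to 1.0 input range, and the sign convention (positive steering turns right) are assumptions.

```python
from typing import Tuple


def skid_steer_mix(throttle: float, steering: float) -> Tuple[float, float]:
    """Mix throttle and steering (each in -1.0..1.0) into (left, right) motor commands."""
    left = throttle + steering   # left track speeds up when turning right
    right = throttle - steering  # right track slows down or reverses
    # Scale both outputs together so neither exceeds full power.
    scale = max(1.0, abs(left), abs(right))
    return left / scale, right / scale


if __name__ == "__main__":
    print(skid_steer_mix(1.0, 0.0))  # straight ahead: (1.0, 1.0)
    print(skid_steer_mix(0.0, 1.0))  # spin in place: (1.0, -1.0)
    print(skid_steer_mix(1.0, 1.0))  # pivot around the right track: (1.0, 0.0)
```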

The builder behind this tank, [honnnest], put up a video going through the build and demonstrating what kind of video you can expect from a tank this small. It’s a bit fast for a tank, and that’s not even considering the scale effects, but if the chassis is 3D printed, you can always print a few reduction gears, too.

tekvax
23 days ago
Burlington, Ontario

This trig problem kept me up too late last night

My daughter is taking a precalc summer school course. Last night she was doing her homework, which was about verifying trigonometric identities. Out of the 25 homework problems, there was one that she got stuck on. I decided to give it a try and spent two hours on it without solving it.

Here it is. Verify the identity:

(sec x - tan x)² = (1 - sin x )/(1 + sin x )

You don't need to know anything about trigonometry to solve this. All you need to know are the fundamental trigonometric identities, which are:
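Assuming the phrase refers to the standard precalculus set (the reciprocal, quotient, and Pythagorean identities), they would read:

```latex
% Reciprocal identities
\sec x = \frac{1}{\cos x}, \qquad
\csc x = \frac{1}{\sin x}, \qquad
\cot x = \frac{1}{\tan x}

% Quotient identities
\tan x = \frac{\sin x}{\cos x}, \qquad
\cot x = \frac{\cos x}{\sin x}

% Pythagorean identities
\sin^2 x + \cos^2 x = 1, \qquad
1 + \tan^2 x = \sec^2 x, \qquad
1 + \cot^2 x = \csc^2 x
```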

My daughter is in class now and she texted me the answer. There's not too many steps involved. Let's see how fast you can solve it.
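One possible verification, sketched using only the reciprocal, quotient, and Pythagorean identities (this may not be the route the texted answer took):

```latex
\begin{align*}
(\sec x - \tan x)^2
  &= \left(\frac{1}{\cos x} - \frac{\sin x}{\cos x}\right)^2
     % rewrite sec and tan in terms of sin and cos
     \\
  &= \frac{(1 - \sin x)^2}{\cos^2 x}
     % combine over the common denominator, then square
     \\
  &= \frac{(1 - \sin x)^2}{1 - \sin^2 x}
     % Pythagorean identity: cos^2 x = 1 - sin^2 x
     \\
  &= \frac{(1 - \sin x)^2}{(1 - \sin x)(1 + \sin x)}
     % difference of squares in the denominator
     \\
  &= \frac{1 - \sin x}{1 + \sin x},
     \qquad \cos x \neq 0 .
\end{align*}
```

The restriction on cos x covers the points where sec x and tan x are undefined, which are also the points where the cancelled factor 1 - sin x or the remaining denominator 1 + sin x vanishes.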

tekvax
23 days ago
Burlington, Ontario