
A Tale of Two Phases and Tech Inertia


What kind of power service is in the United States? You probably answered 120-volt service. If you thought a little harder, you might remember that you have some 240-volt outlets and that some industrial service is three phase. There used to be DC service, but that was a long time ago. That’s about it, right? Turns out, no. There are a very few parts of the United States that have two-phase power. In addition, DC didn’t die as quickly as you might think. Why? It all boils down to history and technological inertia.

Split Phase Power by Charles Esson CC-BY-SA 3.0

You probably have quite a few 120-volt power jacks in sight. It is pretty hard to find a residence or commercial building these days that doesn’t have these outlets. If you have a heavy duty electric appliance, you may have a 240-volt plug, too. For home service, the power company supplies 240 V from a center tapped transformer. Your 120V outlets go from one side to the center, while your 240V outlets go to both sides. This is split phase service.

Industrial customers, on the other hand, are likely to get three-phase service. With three-phase, there are three wires, each carrying the line voltage but out of phase with each other. This allows smaller conductors to carry more power and simplifies motor designs. So why are there still a few pockets of two-phase?

When Electricity Was New

It is easy to look back and realize that AC power transmission has advantages and why three-phase is used. But back when electricity was a new service, none of these things were obvious. Edison, Tesla, and Westinghouse famously battled between using AC and DC current. Back then, AC didn’t mean three-phase AC, though. Two-phase, where the phases were 90 degrees apart, was an easier system to analyze and generate. The famous generators at Niagara Falls, for example, produced two-phase. You can see ten 5,000 HP generators at the falls, below.

It was 1918 before mathematical tools for dealing with polyphase AC readily came about. By then, two-phase was pretty well entrenched. In many cases, once the superiority of three-phase was realized, things were just rewired. But high rise buildings were not always easy or practical to rewire.

Big City, Old Power

The situation was similar with DC power. Did you know that Con Edison, New York City’s power company, still provided DC to some buildings until late 2007? Even then, the buildings didn’t switch everything to AC. They just installed converters so the DC motors that run infrastructure like the elevators didn’t need replacing. The conversion to AC started in 1928 and was supposed to take 45 years. Like most projects, it ran long and took nearly 80 years.

In the case of two-phase, though, there are still pockets of it in Philadelphia and Hartford, Connecticut. This makes being an electrician in those cities a bit interesting, and you can find services advertising their mastery of two-phase work. Incidentally, there are some breathtaking photographs of Philadelphia’s early twentieth century infrastructure. Take a look at the book Palazzos of Power: Central Stations of the Philadelphia Electric Company, 1900-1930.

You might wonder if the power companies in those two cities actually still maintain two-phase generators. As far as we can tell, no. They just convert from three-phase to two-phase using a Scott-T transformer (named after [Charles F. Scott] who worked for Westinghouse). You can see a typical configuration here.

March of Progress

We think of the march of technology as progressive, but it is amazing how many things hold on because of historical precedent. We still have AM radios, for example. My desktop computer can still boot MS-DOS. There’s a lot of inertia even as new tech pushes out the old.

Why 120V? Because Edison’s first generators produced 110V (although, in fairness, 110V DC). After World War II, the nominal voltage kept creeping up until it settled on 120V by 1967. In 1899, a power company in Berlin decided to switch to 220V to increase its ability to distribute power. This took over Europe where 230V (raised up from 220) is the usual voltage.

Thanks to [Tom Frobase] who lived in Pennsylvania for suggesting this topic.

Read the whole story
1 day ago
Burlington, Ontario
Share this story

Building a Portable Solar-Powered Spot Welder: Nearly Practical!


Last time, we covered storing and charging a 3000 Farad supercapacitor to build a solar-powered, portable spot welder. Since then, I’ve made some improvements to the charging circuit and gotten it running. To recap, the charger uses a DC-DC buck converter to convert a range of DC voltages down to 2.6 V. It can supply a maximum of 5 A though, and the supercapacitor will draw more than that if allowed to.

Capacitor charge current decreases with time as the capacitor charges. Source: Hyperphysics

After some failed attempts, I had solved that by passing the buck converter output through a salvaged power MOSFET. A spare NodeMCU module provided pulse width modulated output that switched the MOSFET on for controlled periods of time to limit the charging current. That was fine, but a constant-voltage charger really isn’t the right way to load up a capacitor. Because the capacitor plates build up a voltage as it charges, the current output from a constant-voltage charger is high initially, but drops to a very low rate in the end.

To make something more like a constant-current charger, and lacking a sense resistor, I connected the output to the ADC pin on the NodeMCU. It measures the voltage across the supercapacitor, and as it increases during charging, the NodeMCU increases the amount of time the MOSFET allows current to pass. In other words it increases the duty cycle as the capacitor charges. Note that the firmware I was using supported integer math only, which is why I didn’t just divide by 1.6 in the code:

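pwm.setup(1, 1000, 900)

function set_charge_rate()
  -- read the supercapacitor voltage from the NodeMCU's single ADC channel
  val = adc.read(0)
  -- integer-only stand-in for dividing val by roughly 1.6
  duty = 800 - (val/2 + val/9)
  pwm.setduty(1, duty)
end

tmr.alarm(1, 3000, 1, function() set_charge_rate() end)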
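To see what the integer-only expression is doing, here is a small C sketch (not the firmware code) that mirrors the arithmetic. It assumes the ADC reading is a non-negative integer, and the function names are made up for illustration. Note that val/2 + val/9 works out to roughly val × 0.61, which is val divided by about 1.64, close to the intended 1.6.

```c
/* Integer-only stand-in for val / 1.6, mirroring the Lua snippet:
   1/2 + 1/9 = 0.611..., so this is roughly val / 1.64. */
static int scaled_reading(int val)
{
    return val / 2 + val / 9;   /* both divisions truncate, as in integer-only firmware */
}

/* Duty value the snippet would pass to pwm.setduty() for a given ADC reading. */
static int charge_duty(int val)
{
    return 800 - scaled_reading(val);
}
```

For a reading of 600, scaled_reading() gives 300 + 66 = 366 (600 / 1.6 would be 375), so charge_duty() returns 434.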

This worked much better, but the charge rate was still slower than it could be above around 2.1 volts. To speed it up a bit, I just increased the duty cycle to a fixed value above that point:

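pwm.setup(1, 1000, 900)

function set_charge_rate()
  -- read the supercapacitor voltage from the NodeMCU's single ADC channel
  val = adc.read(0)
  if val < 635 then
    -- below the threshold: scale the duty value with the measured voltage
    duty = 800 - (val/2 + val/9)
    print(val)
  else
    -- at or above the threshold: use a fixed duty value
    duty = 180
  end
  print(duty)
  pwm.setduty(1, duty)
end

tmr.alarm(1, 3000, 1, function() set_charge_rate() end)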

After that last modification, the charge rate was much better and the components involved would get hot, but not alarmingly so. I set the output of the DC-DC converter to 2.6 V, and was able to charge the capacitor past 2.5 V without issue.

Now that the charger was satisfactory, it was time to add electrodes. I had some large copper ring terminals, which were the right size for some steel bolts I had lying around. Crimping wire into these required quite a bit of hammering, but the connection was extremely solid. For cabling, I used three-phase power cable with all three wires attached together to make one thick cable.

I bolted the ring terminals to the copper plates of the supercapacitor holder and to two short, thinner solid core wires that were the electrodes. While ugly, this gave the electrodes good mobility. It was pretty easy to apply them to metal plates and such. Finally, where there was exposed copper, I used heat shrink tubing as insulation.

So far, everything had gone pretty well, so I charged it up, grabbed a spare lithium cell and some tab wire, and gave it a try. When I applied the electrodes, a small spot on the tab wire got yellow-hot very fast without sparking. I quickly removed them… and the tab wire didn’t adhere at all. No matter how I tried, I couldn’t get it to weld in place, although it did heat up whatever material I touched the electrodes to rather well without any part of the device getting particularly hot itself. It feels like it just barely didn’t work, which was a bit frustrating. It did nicely obliterate the tab wire if left on too long though:

In hindsight, there are a few things I could have done better from the start. Most importantly, I should have used both capacitors to make a 5.4 V, 1500 F capacitor bank. Since I don’t know the internal resistance of the supercapacitor (I had incorrectly guessed around 30 mΩ), I should have erred on the side of pessimism. This working build that uses supercapacitors to weld copper used four supercapacitors of similar size! So I charged both capacitors, sanded the electrodes clean, and hooked them up in series in a box for a quick test.
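The 5.4 V, 1500 F figure follows from the usual series rule: voltages add, while the capacitance of n identical capacitors drops to C/n. The C sketch below is only an idealized calculation (it ignores internal resistance and imbalance between the cells), it assumes 2.7 V per capacitor as implied by the 5.4 V figure, and the helper names are invented for illustration.

```c
/* Capacitance of n identical capacitors wired in series: C / n. */
static double series_capacitance(double c_farads, int n)
{
    return c_farads / n;
}

/* Energy stored in an ideal capacitor, E = 1/2 * C * V^2, in joules. */
static double stored_energy(double c_farads, double volts)
{
    return 0.5 * c_farads * volts * volts;
}
```

series_capacitance(3000, 2) gives 1500 F, and stored_energy(1500, 5.4) comes to about 21.9 kJ, twice the roughly 10.9 kJ of a single 3000 F capacitor at 2.7 V. The bank stores more energy and, more to the point for welding, delivers it at twice the voltage.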

What a difference that made! When the electrodes touched the tab wire, they sparked lightly as expected rather than just heating the metal, and it welded into place. Certainly not the best weld, but with some practice I think it could be serviceable. (The actual welds are those spots on the left. The tab was damaged from previous attempts.)

So while this works with a charge of 5.2 V, I suspect it would be better with a third capacitor at 7.8 V. A question remains though: why do some builds seem to work well with only a single supercapacitor? I suspect that these unbranded capacitors didn’t pass quality control, and were sold gray market. A higher than expected internal resistance or lower capacity than claimed are not out of the question and would certainly affect the performance of a spot welder. Lesson learned: for spot welders use genuine parts. I’ve half a mind to open it up, expecting to see a smaller capacitor inside along with a whole bunch of sand. That being said, for $4 these would have been fantastic as part of a trickle charge system for a solar-powered sensor.

Failing that, these questionable parts would make excellent ballast or a terrible casserole. Your project suggestions are welcome!

Separators to be added later.

In any case, it was time to put it in a better box, make it portable, and attach a solar panel. That turned out to be refreshingly straightforward. I went and bought a larger weatherproof plastic box to enclose the supercapacitors. They’re arranged just as if they were large AA batteries.

I bought some slotted angled steel. This is pretty much my favorite construction material; it’s more or less scaled-up Meccano. An interesting fact is that the existence of Meccano is what prevented a general patent for slotted angled steel, which allowed it to be emulated worldwide. I’ve seen buildings made from it.

I built a steel frame to fit a solar panel on top and contain all the parts. This is so it can be strapped to a motorbike or worn as a backpack using bungee cord — inspired by [Joe Kim]’s art for the first article. The solar panel is a 10 W, 18 V monocrystalline unit that I kept around to charge devices during long power outages that used to occur weekly (the power grid is much better now).

In sunlight, the charge time is limited more by the charge controller (in software) than the solar panel output. At the specified limit of 5 A, it just runs too hot. As a compromise, each capacitor has its own charge circuit, which can easily take 15 minutes to get from 1 V to a practical voltage.
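As a rough sanity check on that figure, an ideal constant-current charge takes t = C × ΔV / I. The C sketch below assumes an ideal, lossless 3000 F capacitor and a perfectly constant current, neither of which holds exactly for this build, and the function name is made up for illustration.

```c
/* Ideal constant-current charge time in seconds: t = C * (V_end - V_start) / I. */
static double charge_time_s(double c_farads, double v_start, double v_end, double amps)
{
    return c_farads * (v_end - v_start) / amps;
}
```

charge_time_s(3000, 1.0, 2.5, 5.0) comes to 900 seconds, or 15 minutes, even at the converter's full 5 A limit, so a charger throttled below that limit to keep temperatures down cannot be expected to do better.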

Overall it works… but calling it amazingly practical would be quite a stretch. A more modest build using three or four smaller, branded supercapacitors would have likely worked better, charged faster, as well as being lighter, smaller, and cheaper… perhaps to the point of borderline commercial viability for people who need to weld battery tabs when the power is out. Or I could just give up portability and solar power altogether and use a microwave transformer to build the spot welder.


Inventing The Digital Watch Again And Again And…


In the 1950s, artwork of what the future would look like included flying cars and streamlined buildings reaching for the sky. In the 60s we were heading for the Moon. When digital watches came along in the 70s, it seemed like a natural step away from rotating mechanical hands to space age, electrically written digits in futuristic script.

But little did we know that digital watches had existed before and that our interest in digital watches would fade only to be reborn in the age of smartphones.

Mechanical Digital Watches

Cortébert jump-hour wristwatch.
Image by Wallstonekraft CC-BY-SA 3.0

In 1883, Austrian inventor Josef Pallweber patented his idea for a jumping hour mechanism. At precisely the change of the hour, a dial containing the digits from 1 to 12 rapidly rotates to display the next hour. It does so suddenly and without any bounce, hence the term “jump hour”. He licensed the mechanism to a number of watchmakers who used it in their pocket watches. In the 1920s it appeared in wristwatches as well. The minute was indicated either by a regular minute hand or a dial with digits on it visible through a window as shown here in a wristwatch by Swiss watchmaker, Cortébert.

The jump hour became popular worldwide but was manufactured only for a short period of time due to the complexity of its production. It’s still manufactured today but for very expensive watches, sometimes with a limited edition run.

The modern digital watch, however, started from an unlikely source, the classic movie 2001: A Space Odyssey.

The Watch Inspired By 2001: A Space Odyssey

The next era of digital watches came in 1966 when Stanley Kubrick hired Hamilton Watch Company in Lancaster, Pennsylvania to make a futuristic clock for his upcoming movie, 2001: A Space Odyssey. The resulting clock was shaped like a squashed sphere and displayed time using digits from small Nixie tubes.

The clock never made it into the movie but it inspired its makers, John M. Bergey and Richard S. Walton, to work on a digital watch. They’d also worked together in Hamilton’s Military Division on an electronically timed fuse, technology which they’d thought about applying to watches and clocks.

HP 5082-7000 Numeric Indicator

Meanwhile, also in 1966, George H. Thiess founded a company in Texas called Electro/Data which produced solid-state microwave components and subsystems. Thiess had a dream of developing an accurate watch and devoted a small portion of the company’s resources to the project. After making a few prototype clocks as research for developing the watch, they hired Willie Crabtree from Texas Instruments to work on it as a project engineer and by 1969, had a prototype clock which used the new HP 5082-7000 LED module (PDF) for the digits.

By this time, Hamilton had also developed a digital clock but lacked the resources to scale it down to a watch. They contacted Electro/Data and by December 1969, the companies had an agreement to work on a digital watch together.

The watch they’d come up with was called Pulsar, named for the type of star which sweeps a beam of electromagnetic radiation across space at a precise rate of rotation.

A Pulsar watch from 1976.
Photo by Alison Cassidy CC-BY-SA 3.0

By April 4th, 1972 they had a limited edition of 400 18-carat gold Pulsar watches selling for $2,100 ($12,500 in 2018 dollars). The watch used a quartz crystal for counting time and red LEDs for the display. To save power, the time was not always displayed. Instead, you’d press a button which would show the time for just over a second and if you continued to press, the display would change to show the seconds counting up.

However, there were problems with the 44-IC main computing module supplied by Electro/Data resulting in a recall. Eventually, Hamilton ended up making their own. There were also battery issues with the first watch, leading to the use of 2 specially made silver-oxide button cells, expected to last for a year of up to 25 readouts a day.

By 1975 some Pulsar models were selling for under $300 ($1400 today). At the peak, 150,000 were sold in 1976, but by then electronics companies were selling their own for under $100 ($450) and in 1977 they sold only 10,000.

Enter The Cheap LCD Watches

Casio LCD watch by BBCLCD CC-BY-SA 4.0

The state-of-the-art for LCDs in the 1960s required a constant current, a voltage which was too high for batteries, and a mirror which limited the viewing angle. All that changed in the early 1970s with the TN-effect (Twisted Nematic field effect) which requires low voltage, low power, and no mirror.

By 1972, four-digit watches using these new low-power LCDs hit the market, and six-digits followed from Seiko in 1973. Even Intel got involved through a watch company they’d bought called Microma. Unlike the more power-hungry LED watches, these LCD ones could be always on, though often pressing a button turned on a small light for reading the digits in the dark.

Casio and Texas Instruments entered the market, producing large quantities. In 1976, Texas Instruments alone sold 18 million and soon the price fell below $20. Eventually, TI introduced one with a plastic watchband for just $9.95. As you’d expect, numerous companies weren’t able to compete and either stopped producing digital watches or closed down.

Smartwatches In The 1970s and 80s

It may surprise many to learn that smartwatches are not a new thing. The very first digital watch, the Pulsar, was thought of by its makers as a time computer and Hamilton’s subsidiary for making them was called Time Computer, Inc. They even released a limited edition 18-carat gold Pulsar with a built-in calculator selling for $3,500 with plans for a $600 stainless steel one. But it was in the 1980s that add-ons really hit the market.

Seiko TV Watch. Image source: HighTechies

Seiko came out with a TV watch. The TV display used a transflective LCD which worked with external light, the brighter the better. To watch TV, you plugged it into a receiver which fit in a pocket. The cable going to the headphones doubled as an antenna.

Casio produced a variety of watches with different features such as a built-in calculator, a thermometer, one with a 1,500-word dictionary for Japanese-to-English translation, and even one which could dial your home phone number. Citizen had one with voice control, and both Seiko and Timex released wristwatches which interfaced with computers.

But advancements in commercial digital watches had peaked. After this point most just told the time and date. That is until mobile phones came along.

Mobile Phones Killed The Watch

Mobile phones have displayed the time ever since their early days, bringing on a brief era when many wrists went bare. As watches wore out or broke there was no need to buy new ones. Younger generations grew up with mobile phones and had never even worn watches, either analog or digital.

Only folks who’d worn them all their lives still had watches or, as in my case, had one that cost more to repair than the watch was worth but kept it going for sentimental reasons. It seemed for a while as though the watch would eventually die out altogether.

The Rebirth Of The Watch

While mobile phones, especially in their modern form, the smartphone, seemingly killed the watch, they were also partly responsible for its rebirth as the modern smartwatch.

One reason for the rebirth is that to look at a text message which just arrived or to check the time on a smartphone, you have to pull it out of a pocket or pick it up off a table. In a meeting, it’s less conspicuous to just twist your wrist and glance at something there. On a bicycle, there’s no need to stop and haul out a phone to check if a text message requires attention when a quick glance at your wrist will do.

Another reason is that smartphones brought on advancements in technology which were also useful for watches. Those include more robust glass, better touchscreens, displays, and operating systems for small devices.

Pebble Steel.
Image by Romazur CC-BY-SA 4.0

As with a lot of new products, a few modern smartwatches appeared but they didn’t really take off until the Pebble came along in 2012. Pebble raised a record $10.3 million on Kickstarter, showing that there was a sizeable market. It had a black and white ultra-low-power transflective LCD with a backlight, vibrator, accelerometer, and a battery life between four and seven days. There were over 1,000 apps on the Pebble app store and it could communicate with a smartphone for that convenient glance to see why your smartphone is ringing. Unfortunately, the Pebble ceased production in 2016 when the company was bought by Fitbit for its IP.

Pebble’s success spurred on the introduction of many more smartwatches but the most notable was the Apple Watch in 2014, even though its main effect was arguably that it introduced more consumers to the market.

The jury’s still out on just how ubiquitous the smartwatch will become. For many, checking the time on a smartphone will remain the way while niches will continue to exist such as wrist-mounted health monitors. There does seem to be a resurgence in the number of people wearing analog watches, perhaps as the low tech alternative to hauling out a smartphone to check the time or because they’ve simply become fashionable again.

As someone who wore an analog watch for almost 50 years (except for the 70s when I wore a digital one), I stopped wearing my beloved Cardinal when the strap last broke and now use my smartphone when I’m away from the computer. Did you use a digital watch back in the day? Or perhaps you had one with a calculator? Or was a smartwatch your introduction to the digitized wrist? This being Hackaday, we’re of course expecting a few of the answers to be a DIY smartwatch, like this wooden one, or maybe a hacked Pebble.


CUDA is Like Owning a Supercomputer


The word supercomputer gets thrown around quite a bit. The original Cray-1, for example, operated at about 150 MIPS and had about eight megabytes of memory. A modern Intel i7 CPU can hit almost 250,000 MIPS and is unlikely to have less than eight gigabytes of memory, and probably has quite a bit more. Sure, MIPS isn’t a great performance number, but clearly, a top-end PC is way more powerful than the old Cray. The problem is, it’s never enough.

Today’s computers have to process huge numbers of pixels, video data, audio data, neural networks, and long key encryption. Because of this, video cards have become what in the old days would have been called vector processors. That is, they are optimized to do operations on multiple data items in parallel. There are a few standards for using the video card processing for computation and today I’m going to show you how simple it is to use CUDA, the NVIDIA proprietary library for this task. You can also use OpenCL which works with many different kinds of hardware, but I’ll show you that it is a bit more verbose.

Dessert First

One of the things that’s great about being an adult is you are allowed to eat dessert first if you want to. In that spirit, I’m going to show you two bits of code that will demonstrate just how simple using CUDA can be. First, here’s a piece of code known as a “kernel” that will run on the GPU.

__global__ void scale(unsigned int n, float *x, float *y) {
  int i = threadIdx.x;
  x[i] = x[i] * y[i];  // multiply in place; assumes one block of exactly n threads
}

There are a few things to note:

  • The __global__ tag indicates this function can run on the GPU
  • The set up of the variable “i” gives you the current vector element
  • This example assumes there is one thread block of the right size; if not, the setup for i would be slightly more complicated and you’d need to make sure i < n before doing the calculation

So how do you call this kernel? Simple:

scale<<<1,1024>>>(1024,x,y);

Naturally, the devil is in the details, but it really is that simple. The kernel, in this case, multiplies each element in x by the corresponding element in y and leaves the result in x. The example will process 1024 data items using one block of threads, and the block contains 1024 threads.

You’ll also want to wait for the threads to finish at some point. One way to do that is to call cudaDeviceSynchronize().

By the way, I’m using C because I like it, but you can use other languages too. For example, the video from NVidia, below, shows how they do the same thing with Python.

Grids, Blocks, and More

The details are a bit uglier, of course, especially if you want to maximize performance. CUDA abstracts the video hardware from you. That’s a good thing because you don’t have to adapt your problem to specific video adapters. If you really want to know the details of the GPU you are using, you can query it via the API or use the deviceQuery example that comes with the developer’s kit (more on that shortly).

For example, here’s a portion of the output of deviceQuery for my setup:

CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1060 3GB"
CUDA Driver Version / Runtime Version 9.1 / 9.1
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 3013 MBytes (3158900736 bytes)
( 9) Multiprocessors, (128) CUDA Cores/MP: 1152 CUDA Cores
GPU Max Clock rate: 1772 MHz (1.77 GHz)
Memory Clock rate: 4004 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 1572864 bytes
. . .
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes

Some of this is hard to figure out until you learn more, but the key items are that there are nine multiprocessors, each with 128 cores. The clock is about 1.8 GHz and there’s a lot of memory. The other important parameter is that a block can have up to 1024 threads.

So what’s a thread? And a block? Simply put, a thread runs a kernel. Threads form blocks that can be one, two, or three dimensional. All the threads in one block run on one multiprocessor, although not necessarily simultaneously. Blocks are put together into grids, which can also have one, two, or three dimensions.

So remember the line above that said scale<<<1,1024>>>? That runs the scale kernel with a grid containing one block and the block has 1024 threads in it. Confused? It will get clearer as you try using it, but the idea is to group threads that can share resources and run them in parallel for better performance. CUDA makes what you ask for work on the hardware you have up to some limits (like the 1024 threads per block, in this case).

Grid Stride Loop

One of the things we can do, then, is make our kernels smarter. The simple example kernel I showed you earlier processed exactly one data item per thread. If you have enough threads to handle your data set, then that’s fine. Usually, that’s not the case, though. You probably have a very large dataset and you need to do the processing in chunks.

Let’s look at a dumb but illustrative example. Suppose I have ten data items to process. This is dumb because using the GPU for ten items is probably not effective due to the overhead of setting things up. But bear with me.

Since I have a lot of multiprocessors, it is no problem to ask CUDA to do one block that contains ten threads. However, you could also ask for two blocks of five. In fact, you could ask for one block of 100 and it will dutifully create 100 threads. Your kernel would need to ignore all of them that would cause you to access data out of bounds. CUDA is smart, but it isn’t that smart.
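A CPU-only C sketch can make the surplus-thread case concrete. It is an emulation, not CUDA: the values CUDA supplies as blockIdx.x, blockDim.x, and threadIdx.x are passed in as ordinary parameters, the emulated threads run one after another, and the function names are invented for illustration. Each emulated thread computes its global index and simply returns if that index is past the end of the data.

```c
/* CPU stand-in for one GPU thread of a bounds-checked scale kernel.
   Returns 1 if the thread did work, 0 if its index was out of range. */
static int scale_guarded_thread(unsigned n, float *x, const float *y,
                                unsigned block_idx, unsigned block_dim,
                                unsigned thread_idx)
{
    unsigned i = block_idx * block_dim + thread_idx;  /* global element index */
    if (i >= n)
        return 0;                                     /* surplus thread: touch nothing */
    x[i] = x[i] * y[i];
    return 1;
}

/* Emulate a launch of grid_dim blocks of block_dim threads, run sequentially.
   Returns how many threads actually processed an element. */
static unsigned launch_guarded(unsigned grid_dim, unsigned block_dim,
                               unsigned n, float *x, const float *y)
{
    unsigned busy = 0;
    for (unsigned b = 0; b < grid_dim; b++)
        for (unsigned t = 0; t < block_dim; t++)
            busy += (unsigned)scale_guarded_thread(n, x, y, b, block_dim, t);
    return busy;
}
```

With ten items, launch_guarded(1, 100, 10, x, y) runs 100 emulated threads but only ten of them touch the arrays; launch_guarded(2, 5, 10, x, y) covers the same ten items with two blocks of five.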

The real power, however, is when you specify fewer threads than you have items. This will require a grid with more than one block and a properly written kernel can compute multiple values.

Consider this kernel, which uses what is known as a grid stride loop:

__global__ void scale(unsigned int n, float *x, float *y)
{
 unsigned int i, base=blockIdx.x*blockDim.x+threadIdx.x, incr=blockDim.x*gridDim.x;
 for (i=base;i<n;i+=incr) // note that i>=n is discarded
   x[i]=x[i]*y[i];
}

This does the same calculations but in a loop. The base variable is the index of the first data item to process. The incr variable holds how far away the next item is. If the grid has at least as many threads as there are items, this will degenerate to a single execution per thread. For example, if n is 10 and we have one block of ten threads, then each thread will get a unique base (from 0 to 9) and an increment of ten. Since adding ten to any of the base numbers will exceed n, the loop will only execute once in each thread.

However, suppose we ask for one block of five threads. Then thread 0 will get a base of zero and an increment of five. That means it will compute items 0 and 5. Thread 1 will get a base of one with the same increment so it will compute 1 and 6.
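The same bookkeeping can be traced on the CPU. The C sketch below is an emulation rather than CUDA code: the indices CUDA provides are plain parameters here, and stride_indices is a made-up helper that records which items one thread of the grid stride loop would visit.

```c
/* List the element indices visited by one thread of a grid stride loop.
   Stores up to max_out of them in out and returns the total number visited. */
static unsigned stride_indices(unsigned block_idx, unsigned block_dim,
                               unsigned thread_idx, unsigned grid_dim,
                               unsigned n, unsigned *out, unsigned max_out)
{
    unsigned base = block_idx * block_dim + thread_idx; /* first item for this thread */
    unsigned incr = block_dim * grid_dim;               /* total threads in the grid */
    unsigned count = 0;
    for (unsigned i = base; i < n; i += incr) {
        if (count < max_out)
            out[count] = i;
        count++;
    }
    return count;
}
```

For n = 10 with one block of five threads, thread 0 visits items 0 and 5 and thread 1 visits 1 and 6, matching the walk-through above; with one block of ten threads, each thread visits exactly one item.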

Of course, you could also ask for a block size of one and ten blocks which would have each thread in its own block. Depending on what you are doing, all of these cases have different performance ramifications. To better understand that, I’ve written a simple example program you can experiment with.

Software and Setup

Assuming you have an NVidia graphics card, the first thing you have to do is install the CUDA libraries. You might have a version in your Linux repository but skip that. It is probably as old as dirt. You can also install for Windows (see video, below) or Mac. Once you have that set up, you might want to build the examples, especially the deviceQuery one to make sure everything works and examine your particular hardware.

You have to run the CUDA source files, which by convention have a .cu extension, through nvcc instead of your system C compiler. This lets CUDA interpret the special things like the angle brackets around a kernel invocation.

An Example

I’ve posted a very simple example on GitHub. You can use it to do some tests on both CPU and GPU processing. The code creates some memory regions and initializes them. It also optionally does the calculation using conventional CPU code. Then it also uses one of two kernels to do the same math on the GPU. One kernel is what you would use for benchmarking or normal use. The other one has some debugging output that will help you see what’s happening but will not be good for execution timing.

Normally, you will pick CPU or GPU, but if you do both, the program will compare the results to see if there are any errors. It can optionally also dump a few words out of the arrays so you can see that something happened. I didn’t do a lot of error checking, so that’s handy for debugging because you’ll see the results aren’t what you expect if an error occurred.

Here’s the help text from the program:

So to do the tests to show how blocks and grids work with ten items, for example, try these commands:

./gocuda g p d bs=10 nb=1 10
./gocuda g p d bs=5 nb=1 10

To generate large datasets, you can make n negative and it will take it as a power of two. For example, -4 will create 16 samples.
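The negative-means-power-of-two convention is easy to picture in code. The C sketch below is not taken from the program's source; item_count is a hypothetical helper that only illustrates the rule described above, and it assumes the exponent is small enough for the result to fit in an unsigned long.

```c
/* Interpret a size argument: a negative value -k means 2^k items,
   anything else is taken literally. Assumes 2^k fits in unsigned long. */
static unsigned long item_count(long n)
{
    if (n < 0)
        return 1UL << (unsigned long)(-n);
    return (unsigned long)n;
}
```

item_count(-4) returns 16, as in the example above, and item_count(-20) returns 1,048,576, which is a more realistic size for timing experiments.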

Is it Faster?

Although it isn’t super scientific, you can use any method (like time on Linux) to time the execution of the program when using GPU or CPU. You might be surprised that the GPU code doesn’t execute much faster than the CPU and, in fact, it is often slower. That’s because our kernel is pretty simple and modern CPUs have their own tricks for doing processing on arrays. You’ll have to venture into more complex kernels to see much benefit. Keep in mind there is some overhead to set up all the memory transfers, depending on your hardware.

You can also use nvprof — included with the CUDA software — to get a lot of detailed information about things running on the GPU. Try putting nvprof in front of the two example gocuda lines above. You’ll see a report that shows how much time was spent copying memory, calling APIs, and executing your kernel. You’ll probably get better results if you leave off the “p” and “d” options, too.

For example, on my machine, using one block with ten threads took 176.11 microseconds. With one block of five threads, that time went down to 160 microseconds. That’s not much, but it shows how doing more work in each thread cuts the thread setup overhead, which can add up when you are processing a lot more data.
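
To make “more work in one thread” concrete, here is a plain C++ model of the index arithmetic, with no GPU required. It assumes the kernel uses the common grid-stride loop pattern, which is an assumption since the kernel source isn’t reproduced here. In a real kernel, block, thread, block_size, and num_blocks would correspond to blockIdx.x, threadIdx.x, blockDim.x, and gridDim.x. The function name items_per_thread is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// CPU-side model of a grid-stride loop (an assumed pattern, not the actual
// kernel): returns how many of the n items each thread would process.
std::vector<std::size_t> items_per_thread(std::size_t block_size,
                                          std::size_t num_blocks,
                                          std::size_t n) {
    const std::size_t stride = block_size * num_blocks;  // total thread count
    std::vector<std::size_t> counts(stride, 0);
    for (std::size_t block = 0; block < num_blocks; ++block) {
        for (std::size_t thread = 0; thread < block_size; ++thread) {
            // Each thread starts at its global index and then jumps ahead
            // by the total thread count until it runs past the data.
            const std::size_t first = block * block_size + thread;
            for (std::size_t i = first; i < n; i += stride) {
                ++counts[first];
            }
        }
    }
    return counts;
}
```

Under that assumption, bs=10 nb=1 with ten items gives every thread exactly one item, while bs=5 nb=1 gives each of the five threads two items, so half as many threads have to be set up for the same amount of data.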


OpenCL has a lot of the same objectives as CUDA, but it works differently. Some of this is necessary since it handles many more devices (including non-NVidia hardware). I won’t comment much on the complexity, but I will note that you can find a simple example on GitHub, and I think you’ll agree that if you don’t know either system, the CUDA example is a lot easier to understand.

Next Steps

There’s lots more to learn, but that’s enough for one sitting. You might skim the documentation to get some ideas. You can compile just in time if your code is more dynamic, and there are plenty of ways to organize memory and threads. The real challenge is getting the best performance by sharing memory and optimizing thread usage. It is somewhat like chess. You can learn the moves, but becoming a good player takes more than that.

Don’t have NVidia hardware? You can even do CUDA in the cloud now. You can check out the video for NVidia’s setup instructions.

Just remember, when you create a program that processes a few megabytes of image or sound data, that you are controlling a supercomputer that would have made [Seymour Cray’s] mouth water back in 1976.

Classic 1950s IBM Short Film on the Making of the RAMAC, the First Computer with Magnetic Disk Storage | #retrocomputing

Check out the classic IBM video below detailing the inception of the first magnetic disk storage computer, the RAMAC (Random Access Method of Accounting and Control) – mind you the ‘hard disk’ is fifty 24″ platters! The video is filled with some pretty silly shots, such as the RAMAC installed on a dock adjacent to some water, or some ‘cuts’ of men walking back and forth between rooms to suggest the passage of time. On that note the video is filled with plenty of male-centric shots and suggestions, like only hiring men out of college – “each man selected” – and men dictating ideas that only women write down. (Here at Adafruit it’s safe to say we’ve dismantled that stereotype.) But it does also show some interesting backstory, such as suggesting magnetic ‘disk’ design, and the arm that searches and retrieves data from it, was at least in part inspired by the phonograph and its ability to quickly ‘seek’ ahead on a vinyl record with a simple lift and relocation of the stylus. (Did I mention there’s a shot of the RAMAC situated on a dock adjacent to water?)

Anyhow, enjoy:

At 2:28 in the video above you can see the following building where the creation of the hard disk took place – the building was designated a historic landmark (PDF) in 2002 by the San Jose City Council. The sign on the building suggests it was available to lease as recently as October 2017 (via Street View):

Segments from the archival IBM film above can be seen in this more-modern voice-over video by Michael Bazeley for the San Jose Mercury News – along with additional footage and high-res shots woven in:

If you’ve scrolled this far you might want to read more still here at PC World.

The Universe from Animaniacs and Monty Python #SaturdayMorningCartoons

via MelodySheep

Sometimes you just need some perspective. So here’s an oldie: a song the size of the cosmos, from the dear departed Carl Sagan and Stephen Hawking.

Each Saturday Morning here at Adafruit is Saturday Morning Cartoons! Be sure to check our cartoon and animated posts both nostalgic and new that inspire makers of all ages! You’ll find how-tos for young makers, approaches to learning about science and engineering, and all sorts of comic strip and animated Saturday Morning fun! Be sure to check out our Adafruit products featuring comic book art while you’re at it!
