
On the phenomenon of rubbish patents

a train of thoughts written on a train

You are starting your next venture with a brief search on what has been done before. Stumbling across some patents, you get excited and start exploring the database vigorously. It isn't long before you realize that something doesn't feel quite right: nearly 2/3 of the patents you come across are vague and do not make sense. You then enter a questioning phase: but why?

All granted patents in the databases have a certain role. They are like the bureaucrats in the public sector: some are completely useless, others a bit pointless, and you can also find a few good ones. But why is it that the majority are rubbish? The short answer is that a patent's purpose is to protect someone's intellectual property from misuse, but a large portion are instead used for image building and lobbying. The long answer, however, is a bit more involved and perhaps cannot be given in a few lines. But let me try to elaborate on what I think can be defined as rubbish, and later on state my views on a utopian world of patents.

So, what's a rubbish patent anyway? These are patents which include methods or apparatus with false claims, which have never worked and never will. They are usually submitted by universities or companies which are hopelessly trying to boost their sale value by enriching their patent portfolios. Similarly, university professors sometimes use the patenting strategy to provide that kick which makes the graduation of lower-performing PhD students possible, or, why not, even to justify project money funded by the taxpayer. Easy, huh? In the end, who doesn't get impressed by a few US/EU patents backing up a CV? These patents are generally harmless, except for the disturbance they cause in the public monetary balance.

The other, and more prominent, type of bullshit patents are the ones filed by patent trolls, whose sole purpose is to milk someone legally. Patent trolling is usually done by individuals (typically well-trained lawyers) who file patents or buy them from a bankrupt company and then attempt to enforce patent rights against accused infringers far beyond the patent's actual value or contribution to the prior art. Patent trolls do not really create products or supply any services based upon the patents in question. These are perhaps a bit less of a problem in Europe than in the U.S. because Europe has a fairer "loser pays the trial costs" regime. But still, this kind of trickery is harmful to all sensible companies who add value to society, and objectively, the end users are the ones who suffer and have always been covering the mess.

Patent trolling is just a product of a lawyer's imagination, does not contribute any value to society, should be regarded as a crime, and must be eradicated. That's probably easier said than done, but here's my suggestion for Intellectual Property (IP) handling in a utopian world.

My vision for the patenting system in a dreamland is not to have it in the first place. Eradication of the whole scheme would mean two things: all harmful trickery rooted in the legal system would vanish, and competition would be driven solely by creativity, leading to even faster technological development. Having no secrets in science and engineering means that to be better one will have no other option but to expand his/her creativity, instead of trying to hide, protect, and waste energy on lawsuits. You might argue that once we lift IP protection some companies will go bankrupt, while others will just cheat by keeping secrets (which happens nowadays anyway and is totally fine). Well, here is my point: lifting the patenting barriers will work if, and only if, IP is always shared and open, which isn't particularly easy to enforce.

The last point is likely to be the most mountainous issue, one typically encountered in all utopian dreams: you can't re-program people's brains overnight. We need someone like Rotwang, popping out of the dungeons of Metropolis to ring that global brain-resonance tuning bell, but this only exists on film reels. Until then, it seems like we'll have to accept things as they are, and stick to living in a world stuffed with patent lawyers, bullshit patents and inefficiency.

Date:Thu May 23 06:44:49 GMT 2017


Live demo of the column-parallel ADC testchip

I was lucky enough to have my work accepted at IISW this year. As you may notice, the technical program is really crammed. I have 9 minutes to present my work and results, which borders on the impossible. Thus, I decided to create a follow-up video to the presentation and placed it in the last slide. A cheeky way out in case the conference presentation goes horribly wrong.

So here's a brief demo, enjoy:

Also, be warned! A few words are verbally misplaced in my explanations and I don't necessarily mean everything I say in this short clip. It looks rather amateurish, but at least I tried :)

Date:Mon May 16 09:48:03 GMT 2017


Rustic Runes

The year is 2747. The AI apocalypse in 2314 has been long forgotten now and new intelligent biological life forms craving for new information start to emerge. But what's left on Earth now are scattered Runestones with odd mystic symbols. Where could all those runic stamps be coming from?

AI-modified runestone in the county of Östergötland and Linköpings kommun, Sweden

Hehe, I just can't help pointing out the horrific chromatic aberration of the lens in my picture...

Date:Mon May 1 12:18:04 GMT 2017


Stitching panoramic die images

The practical definition of dog work

Our lab has a neat brand-new-car-worth Nikon optical microscope... Great!... hehe, except that it's useless... It has a very small field of view together with a very large minimum magnification factor of x10. All of that resulted in a lot of dog work until I acquired a simple panoramic image of a chip.

To distill something good from all the drudgery involved I decided to create a fast-forward timelapse of the whole process. Behold, my artsy creation:

Date:Mon May 3 18:23:11 GMT 2017


A 250 Mbps Sub-LVDS Transmitter

that was supposed to run at 600 Mbps but it didn't

To quickly read out the data from the column-parallel ADCs I had to design a serial interface with multiple high-speed LVDS data links. The possession of a fast driver is indispensable in such situations.

I wanted to avoid reading out the SRAM in parallel with the data conversion process so as to reduce the overall ADC noise as much as possible. To fit within the system's timing requirements, this meant that the data readout had to occur as fast as possible. I estimated that serial LVDS links of about 500-600 Mbps should suffice for reading out the 13 kbit on-chip SRAM via 16 serial LVDS pairs. It's worth mentioning that the design described here was put together in about a week, excluding some earlier architecture exploration. So, just as a side note, don't expect to read about miracles: the Tx turned out to be just "good enough".
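As a rough feel for the numbers, here is a minimal sketch of the raw readout time over the parallel links. It assumes "13 kbit" means 13,000 bits, that the data is split evenly across the 16 pairs, and that there is no framing or training overhead; the function and constant names are mine and purely illustrative.

```python
def readout_time_s(total_bits: int, n_links: int, link_rate_bps: float) -> float:
    """Time to shift total_bits out over n_links equally loaded serial links."""
    bits_per_link = -(-total_bits // n_links)  # ceiling division
    return bits_per_link / link_rate_bps


SRAM_BITS = 13_000  # assumption: "13 kbit" read as 13e3 bits
N_LINKS = 16        # number of LVDS pairs, from the text

for rate_mbps in (250, 500, 600):
    t = readout_time_s(SRAM_BITS, N_LINKS, rate_mbps * 1e6)
    print(f"{rate_mbps} Mbps per link -> {t * 1e6:.2f} us raw readout time")
```

The same function makes it easy to see how much the readout window stretches when the per-link rate drops.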


My initial strategy was to try to implement a textbook design, hence the classic four-switch dual current mirror driver with the typical CMFB scheme using a high-ohmic resistive divider. It turns out that, because of the 1.2 V supply voltage of this process (0.13 um), I could not bring the current sources into proper saturation margins without extensive optimization, for which I had virtually no time. Thus, I ended up using the following mastermind scheme:

The transistors PM4 and PM6 act as switchable current sources and thus should theoretically improve the LVDS voltage swing headroom. These are switched using a passive switched capacitor scheme which uses the NFETs (NM5, NM6, NM7 and NM39) as well as two MIM capacitors. The top plates of the MIM caps are connected to vbias, while their bottom plates are driven by the complementary data line signals d and d_b. This forms the dual complementary switchable current sources, whose gate voltages span between vbias and VDD.

Switched current source

Theoretically this should all work very well, except that the Cgs of the current source dampens the gate switching, imposing the use of a large feedback (pump) capacitor, which introduces even more parasitics and hence lowers the bandwidth. The theoretical maximum swing on the gate of PM4 and PM6 is determined by the damping factor:

$$\Delta V = \frac{C_{p}}{C_{p}+C_{fb}} V_{pwr}$$

Following that relation, I estimated that a cap of about 550 fF should suffice to keep the damping factor low. However, even assuming that one takes care of the parasitics and damping around PM4 and PM6, there is a secondary issue: the speed of the pump switches and of the bias generation.
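To see how the pump capacitor size trades against damping, here is a small sketch of the capacitive divider. It reads the relation above as the part of the drive swing that is lost to the parasitic (my interpretation, the text does not spell it out), so the swing that reaches the gate is what remains. The 50 fF parasitic is a hypothetical placeholder, not a value from the design.

```python
def damping_loss_v(c_p: float, c_fb: float, v_pwr: float) -> float:
    """Swing lost across the pump cap: C_p / (C_p + C_fb) * V_pwr."""
    return c_p / (c_p + c_fb) * v_pwr


def gate_swing_v(c_p: float, c_fb: float, v_pwr: float) -> float:
    """Swing left at the gate node of the capacitive divider (assumed reading)."""
    return v_pwr - damping_loss_v(c_p, c_fb, v_pwr)


V_PWR = 1.2   # supply voltage from the text
C_P = 50e-15  # hypothetical gate parasitic, for illustration only

for c_fb in (100e-15, 250e-15, 550e-15):
    loss = damping_loss_v(C_P, c_fb, V_PWR)
    swing = gate_swing_v(C_P, c_fb, V_PWR)
    print(f"C_fb = {c_fb * 1e15:.0f} fF: loss {loss * 1e3:.0f} mV, gate swing {swing * 1e3:.0f} mV")
```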

Bias dynamics

Normally when designing a biasing network, one tries to keep the currents in the "bias" and "conditioning" branches low, as these are kind of "wasted" and do no work other than generating a bias voltage. The bias voltage generation and distribution in the current scheme, however, exhibits fast dynamics and therefore either has to be buffered (without any offset, which involves more circuitry) or, as a simpler solution, the current in the bias branch itself can be cranked up so that it can charge the large feedback caps before the LVDS eye has settled. The good news here is that after the initial settling only a small charge is lost from the capacitor, used for charging and discharging the Cgs parasitics of PM4/6. Assuming that the parasitics steal about 10% of the charge on Cfb, we can check whether a current of about 128 uA would suffice to quickly replenish Cfb, and whether that Class A solution would be good enough for the purpose.

$$ C_{replenish} = 0.1 \times 500 fF = 50 fF $$
$$ \text{Absolute voltage when fully charged} = V_{dd} - V_{t} = 700 mV $$
$$ \text{Deprived voltage is 10 %}\approx 70 mV $$
$$ \text{Quiescent current needed to replenish the cap within 200 ps} $$
$$ V = \frac{I \times t}{C} \Leftrightarrow I = \frac{V \times C}{t} = \frac{0.07 [V] \times 500 \times 10^{-15} [F]}{200 \times 10^{-12} [s]} = 175 [\mu A] $$

A current of that order at 1.2 V is not ridiculously high, so let it be. Here's the switching response, which roughly matches expectations:

The bottom plot shows the gates of the current sources, the mid-plot shows the replenishing current, and the top plot shows the LVDS line voltage at 500 Mbps. Hmm, settling doesn't look particularly impressive, but at this point I had to move on as the tapeout deadline was fast approaching.
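The replenish estimate in this section is a plain charge balance, I = C x dV / t. Below is a minimal sketch using the numbers quoted in the text (500 fF, a 10 % droop on 700 mV, a 200 ps window); the names are mine and only illustrative.

```python
def replenish_current_a(c_fb_f: float, droop_v: float, t_s: float) -> float:
    """Average current needed to restore droop_v on c_fb_f within t_s."""
    return c_fb_f * droop_v / t_s


C_FB = 500e-15          # pump capacitor used in the estimate
V_FULL = 0.7            # Vdd - Vt when fully charged
DROOP_FRACTION = 0.10   # share of the charge assumed stolen by parasitics
T_REPLENISH = 200e-12   # time allowed for the recharge

droop = V_FULL * DROOP_FRACTION
i_needed = replenish_current_a(C_FB, droop, T_REPLENISH)
print(f"droop = {droop * 1e3:.0f} mV, replenish current = {i_needed * 1e6:.0f} uA")
```

Sweeping `T_REPLENISH` shows how quickly the required Class A bias current grows as the settling window shrinks.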

Common-mode feedback

Perhaps the CMFB design of this architecture is the most challenging part, because of the many unknown factors such as the load capacitance and the additional parasitics via the compensation MIM cap Cc. Not knowing the exact pad + PCB track capacitance was another factor which could screw up the stability of the loop. While there are several possible CMFB loop stabilization approaches, I decided to use an indirect Miller compensation with a zero cancellation resistor, which also limits the feedforward current. In that scheme, to a first-order approximation, there is a non-dominant pole at:

$$p_{nd} = \frac{g_{m8}}{C_{gs4} + C_{ds0}}$$

Also from $C_{c}$ and $R_{z}$:

$$p_{c} = \frac{-1}{R_{z}\,\frac{C_{gs4}C_{c}}{C_{gs4}+C_{c}}}$$

And because of the nulling resistor you also get a cancellation zero:

$$z_{c} = \frac{-1}{R_{z}C_{c}}$$

Which cancels the output pole and leads the phase. Here is the response of the system obtained via AC SPICE simulations:

The bottom plot shows the loop's response without compensation. As you might note, the compensation does a pretty good job of keeping the system at 60 deg phase margin, without significant phase roll-off even beyond the 0 dB magnitude level, implying that it should hold up well across PVT corners. Which it does, except that this simulation does not include the parasitics, which kind of ruin the PM of the system (read further).
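For a quick sanity check of where the poles and the zero land relative to each other, the first-order expressions above can be evaluated directly. The sketch reads the capacitance term in p_c as the series combination of C_gs4 and C_c. All component values below are hypothetical placeholders chosen only to exercise the formulas; none of them come from the actual design.

```python
import math


def series_c(c1: float, c2: float) -> float:
    """Series combination of two capacitors."""
    return c1 * c2 / (c1 + c2)


def cmfb_poles_zero(gm8: float, c_gs4: float, c_ds0: float, c_c: float, r_z: float) -> dict:
    """First-order pole/zero locations in rad/s, following the expressions in the text."""
    return {
        "p_nd": gm8 / (c_gs4 + c_ds0),
        "p_c": -1.0 / (r_z * series_c(c_gs4, c_c)),
        "z_c": -1.0 / (r_z * c_c),
    }


def to_hz(omega: float) -> float:
    """Magnitude of an angular frequency expressed in Hz."""
    return abs(omega) / (2.0 * math.pi)


# Hypothetical values, for illustration only.
result = cmfb_poles_zero(gm8=1e-3, c_gs4=200e-15, c_ds0=50e-15, c_c=2e-12, r_z=2e3)
for name, omega in result.items():
    print(f"{name}: {to_hz(omega) / 1e6:.1f} MHz")
```

Because the series combination is always smaller than C_c alone, p_c always sits above z_c in frequency for any choice of values.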

While there are probably more efficient compensation approaches, such as a proper (buffered) indirect compensation with e.g. a source follower, which would cancel the feedforward current completely and allow the use of a much smaller compensation capacitor, these are more time-consuming to design. Hence the approach here: use a large direct Miller compensation and not care about area. This approach, as seen later in the text, comes at a bandwidth cost due to the parasitics on the bottom plate of the MIM cap.

The common-mode range is dictated by the Vds saturation margins of the switchable current source, as well as the input range of the CMFB error amplifier. In the current design it was estimated that it could span between 0.6 and 0.9 V. Most modern FPGA receivers can detect such LVDS common-mode voltages. The LVDS swing was tuneable between 50 mV and 300 mV peak-to-peak.

Load model

I've used the ESD protection in the pads in combination with 2 pF of (extra) internal capacitance, as well as about 6 pF for the PCB track and FPGA pads. As we shall see from the measurements, these values might have been a bit on the optimistic side. Here's the load model used for design purposes:

Doubling the far-end capacitance might have been a better idea.
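To get a feel for how sensitive the link is to the assumed load, here is a deliberately crude lumped single-pole sketch. It assumes each line sees 50 ohm (half of a 100 ohm differential termination), lumps the 2 pF internal and 6 pF far-end capacitance together, and ignores the ESD capacitance, the driver's output impedance and any transmission-line behaviour. Treat it as an order-of-magnitude check only.

```python
import math


def rise_time_10_90_s(r_ohm: float, c_f: float) -> float:
    """10-90 % rise time of a single-pole RC: ln(9) * R * C."""
    return math.log(9.0) * r_ohm * c_f


def bit_period_s(rate_bps: float) -> float:
    """Duration of one bit at the given data rate."""
    return 1.0 / rate_bps


R_LINE = 50.0       # assumption: half of the 100 ohm differential termination
C_INTERNAL = 2e-12  # extra internal capacitance from the load model
C_FAR_END = 6e-12   # PCB track + FPGA pads from the load model

for label, c_far in (("as modelled", C_FAR_END), ("far end doubled", 2 * C_FAR_END)):
    t_r = rise_time_10_90_s(R_LINE, C_INTERNAL + c_far)
    print(f"{label}: rise time {t_r * 1e9:.2f} ns "
          f"vs bit period {bit_period_s(600e6) * 1e9:.2f} ns at 600 Mbps")
```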


So! It underperformed greatly, but luckily didn't fail! It was designed for 600 Mbps but achieved only 250 Mbps. So what happened? Here are a few theories:

— inefficient architecture + design: the Class A bias driver used by the switched current source "pump" capacitors is a rather inefficient approach. A better implementation would use a proper switched-cap buffer, which should significantly improve current source switching and hence settling.

— last-minute change: Yes, these are always risky! Dummy fill was deliberately blocked inside the LVDS core due to worries about parasitic effects. However, Dongbu returned the layout with concerns about overetching of a few logic gates inside the Tx. Rushing to submit a new GDSII, I removed all dummy fill blockers under the feedback caps without running parasitic extraction. This added a bit of parasitic capacitance between the bottom plate of the pump capacitors and the substrate. Apparently this extra capacitance greatly reduced the bandwidth of the switched current sources. Oops!

As a result, due to asymmetric loading of the control lines, the common-mode of the P and N lines drifted, thus closing the eye of the data line. In principle this shouldn't have mattered as the LVDS Rx in the FPGA is AC-coupled anyway, but the FPGA deserialization block just couldn't lock the training pattern at the designed speeds...

— more parasitics: while I didn't really observe any CMFB instability, some common-mode ringing was exhibited at high speeds, which is probably the reason why the ISERDES failed to lock. The loop's PM has been reduced by the extra parasitics on the Miller compensation:

— even more underestimated parasitics: to reduce the body effect on the switches and help improve substrate noise isolation, I decided to put the N-FET switches under a local DNW and tie the bulk to the source. Doing this adds an extra reverse-biased junction at the tail bias node, which lies exactly within the feedback loop, reducing phase margin. I knew about that effect, so I decided to run some ballpark junction capacitance estimations by fetching some parameters from the SPICE models and using the typical square law model:

$$C_{tot} = C_{j} + C_{jsw} = \frac{A_{d} C_{j}}{(1+V/V_{o})^{mj}} + \frac{P_{d} C_{jsw}}{(1+V/V_{o})^{mjsw}}$$

where $mj$ and $mjsw$ are process-specific coefficients which, along with $C_{j}$ and $C_{jsw}$, I fetched from the SPICE models. These led to the following result:
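For anyone wanting to repeat this kind of ballpark estimate, here is a small sketch using the common SPICE-style reverse-bias junction capacitance form, with the grading exponents applied to the (1 + V/V_o) term only. The parameter values are hypothetical placeholders, not the ones fetched from the foundry models, so the printed number is not the result referred to in the text.

```python
def junction_cap_f(area_um2: float, perim_um: float,
                   cj_f_per_um2: float, cjsw_f_per_um: float,
                   v_rev: float, v_o: float, mj: float, mjsw: float) -> float:
    """Bottom-plate plus sidewall junction capacitance at reverse bias v_rev."""
    bottom = area_um2 * cj_f_per_um2 / (1.0 + v_rev / v_o) ** mj
    sidewall = perim_um * cjsw_f_per_um / (1.0 + v_rev / v_o) ** mjsw
    return bottom + sidewall


# Hypothetical parameters for a 10 x 10 um junction, for illustration only.
c_tot = junction_cap_f(
    area_um2=100.0, perim_um=40.0,
    cj_f_per_um2=1e-16,   # placeholder: 0.1 fF/um^2
    cjsw_f_per_um=5e-16,  # placeholder: 0.5 fF/um
    v_rev=0.6, v_o=0.7, mj=0.4, mjsw=0.3,
)
print(f"C_tot = {c_tot * 1e15:.1f} fF")
```

Note how strongly the outcome depends on the sidewall term and on the per-area value. That is a reason to cross-check hand estimates like this against an extracted netlist whenever time allows.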

Computing 5 femtofarads for a 10 x 10 um diode? I must have lost my mind; this model is very, very wrong, and the actual value must be larger. At that point, however, the deadline was approaching so quickly that I simply had to keep ignoring things. Here are some waveforms run at 100 Mbps using a 100 ohm termination resistance built into the FPGA receiver:

Physical overview:

The layout is rather huge, as can be seen from the layout diagram. I removed all active circuitry under the pump and Miller caps so as to reduce parasitics. However, the last-minute dummy generation sadly filled all of the area below.

Nevertheless, the driver does just fine up to 250 Mbps and allows me to run and evaluate the ADCs at full speed, although it would have been nicer not to have to convert and read out at the same time. I hope this mixture of thoughts helps someone who decides to try this architecture.

Date:Thu Apr 06 14:02:09 GMT 2017