

Live demo of the column-parallel ADC testchip

I was lucky enough to have my work accepted at IISW this year. As you may notice, the technical program is really crammed. I have 9 minutes to present my work and results, which borders on the impossible. Thus, I decided to create a follow-up video to the presentation and placed it in the last slide. A cheeky way out in case the conference presentation goes horribly wrong.

So here's a brief demo, enjoy:

Also, be warned! A few words are misplaced in my explanations, and I don't necessarily mean everything I say in this short clip. It looks rather amateurish, but at least I tried :)

Date:Mon May 16 09:48:03 GMT 2017




Rustic Runes

The year is 2747. The AI apocalypse of 2314 has long been forgotten, and new intelligent biological life forms craving new information are starting to emerge. But all that's left on Earth now are scattered runestones with odd mystic symbols. Where could all those runic stamps be coming from?

AI-modified runestone in the county of Östergötland and Linköpings kommun, Sweden

Hehe, I just can't help pointing out the horrific chromatic aberration of the lens in my picture...

Date:Mon May 1 12:18:04 GMT 2017




Stitching panoramic die images

The practical definition of dog work

Our lab has a neat brand-new-car-worth Nikon optical microscope... Great!... hehe, except that it's useless... It has a very small field of view together with a very large minimum magnification factor of x10. All of that resulted in a lot of dog work before I could acquire a simple panoramic image of a chip.

To distill something good from all the drudgery involved I decided to create a fast-forward timelapse of the whole process. Behold, my artsy creation:

Date:Mon May 3 18:23:11 GMT 2017




A 250 Mbps Sub-LVDS Transmitter

that was supposed to run at 600 Mbps but it didn't

To quickly read out the data from the column-parallel ADCs I had to design a serial interface with multiple high-speed LVDS data links. A fast driver is indispensable in such situations.

I wanted to avoid reading out the SRAM in parallel with the data conversion process so as to reduce the overall ADC noise as much as possible. To fit within the system's timing requirements, the data readout therefore had to happen as fast as possible. I estimated that serial links of about 500-600 Mbps should suffice for reading out the 13 kbit on-chip SRAM via 16 serial LVDS pairs. It's worth mentioning that the design described here was thrown together in about a week, not counting some earlier architecture exploration. So, just as a side note, don't expect miracles; the Tx turned out to be just "good enough".
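To put rough numbers on that estimate, here is a minimal sketch of the readout time per SRAM dump. It assumes that "13 kbit" means 13 x 1024 bits, that the data is split evenly over the 16 links, and that there is no framing or training overhead; the function name is made up for illustration.

```python
# Rough readout-time estimate for dumping the on-chip SRAM over parallel
# serial links. ASSUMPTIONS: 13 kbit = 13 * 1024 bits, even split across
# links, no protocol overhead (training patterns, framing).

def readout_time_s(total_bits: int, n_links: int, rate_bps: float) -> float:
    """Time to shift total_bits out over n_links, each running at rate_bps."""
    bits_per_link = total_bits / n_links
    return bits_per_link / rate_bps

SRAM_BITS = 13 * 1024   # ASSUMPTION: binary kbit
N_LINKS = 16            # from the text

for rate_mbps in (250, 500, 600):
    t = readout_time_s(SRAM_BITS, N_LINKS, rate_mbps * 1e6)
    print(f"{rate_mbps} Mbps per link: {t * 1e6:.2f} us per readout")
```

Since the time scales inversely with the link rate, a dump at 250 Mbps takes 2.4 times as long as one at 600 Mbps.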

Architecture

My initial strategy was to implement a textbook design: the classic four-switch, dual current mirror driver with the typical CMFB scheme using a high-ohmic resistive divider. It turned out that, with the 1.2 V supply of this 0.13 um process, I could not keep the current sources in saturation with proper margins without extensive optimization, for which I had virtually no time. Thus, I ended up using the following mastermind scheme:

The transistors PM4 and PM6 act as switchable current sources and thus should theoretically improve the LVDS voltage swing headroom. These are switched using a passive switched capacitor scheme which uses the NFETs (NM5, NM6, NM7 and NM39) as well as two MIM capacitors. The top plates of the MIM caps are connected to vbias, while their bottom plates are driven by the complementary data line signals d and d_b. This forms the dual complementary switchable current sources, whose gate voltages span between vbias and VDD.

Switched current source

Theoretically this should all work very well, except that the Cgs of the current source dampens the gate switching. This forces the use of a large feedback (pump) capacitor, which introduces even more parasitics and hence lowers the bandwidth. The part of the swing that is lost on the gates of PM4 and PM6, and with it the theoretical maximum swing that remains, is determined by the damping factor:

$$\Delta V = \frac{C_{p}}{C_{p}+C_{fb}} V_{pwr}$$

Following that relation I estimated that a cap of about 550 fF should suffice to keep the damping factor low. However, even once the parasitics and damping around PM4 and PM6 are taken care of, there is a secondary issue: the speed of the pump switches and of the bias generation.
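A quick numeric reading of that relation, as a sketch only. It treats $C_{p}$ as the parasitic loading the gate and $C_{fb}$ as the pump cap, so that $C_{p}/(C_{p}+C_{fb})$ is the fraction of the drive swing that gets lost and the rest reaches the gate. The 550 fF comes from the text; the 50 fF parasitic and the 1.2 V swing on d/d_b are placeholder assumptions, not extracted values.

```python
# Capacitive divider at the gate of a switched current source.
# c_fb: pump (feedback) capacitor, c_p: parasitic on the gate node.

def damping_factor(c_p: float, c_fb: float) -> float:
    """Fraction of the drive swing lost to the parasitic: Cp / (Cp + Cfb)."""
    return c_p / (c_p + c_fb)

def gate_swing(c_p: float, c_fb: float, v_pwr: float) -> float:
    """Swing that reaches the gate: Vpwr * Cfb / (Cp + Cfb)."""
    return v_pwr * (1.0 - damping_factor(c_p, c_fb))

C_FB = 550e-15  # from the text
C_P = 50e-15    # ASSUMPTION: placeholder gate parasitic (Cgs + wiring)
V_PWR = 1.2     # ASSUMPTION: d/d_b swing rail to rail on a 1.2 V supply

for c_fb in (100e-15, 250e-15, C_FB):
    d = damping_factor(C_P, c_fb)
    v = gate_swing(C_P, c_fb, V_PWR)
    print(f"Cfb = {c_fb * 1e15:.0f} fF: damping {d:.3f}, gate swing {v:.3f} V")
```

The trade-off is visible right away: a bigger pump cap recovers more swing, but it is also a bigger load for the bias branch to recharge.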

Bias dynamics

Normally, when designing a biasing network, one tries to keep the currents in the "bias" and "conditioning" branches low, as they are kind of "wasted" and do no other work than generating a bias voltage. The bias generation and distribution in this scheme, however, exhibit fast dynamics. The bias therefore either has to be buffered (without any offset, which involves more circuitry) or, more simply, the current in the bias branch itself has to be cranked up so that it can charge the large feedback caps before the LVDS eye has settled. The good news is that after initial settling only a small charge is lost from the capacitor, namely the part used for charging and discharging the Cgs parasitics of PM4/PM6. Assuming that the parasitics steal about 10% of the charge on Cfb, we can check whether a current of about 128 uA suffices to quickly replenish Cfb, and whether that Class A solution is good enough for the purpose.

$$ C_{replenish} = 0.1 \times 500\ fF = 50\ fF $$
$$ \text{Absolute voltage when fully charged} = V_{dd} - V_{t} = 700\ mV $$
$$ \text{Deprived voltage is 10 %} \approx 70\ mV $$
$$ \text{Quiescent current needed to replenish the cap within 200 ps:} $$
$$ V = \frac{I \times t}{C} \Leftrightarrow I = \frac{V \times C}{t} = \frac{0.07\ [V] \times 500 \times 10^{-15}\ [F]}{200 \times 10^{-12}\ [s]} = 175\ [\mu A] $$

That is somewhat more than the 128 uA considered above, but a current of this order at 1.2 V is not ridiculously high, so let it be. Here's the switching response, which roughly matches expectations:

The bottom plot shows the gates of the current sources, the middle plot shows the replenishing current, and the top plot shows the LVDS line voltage at 500 Mbps. Hmm, the settling doesn't look particularly impressive, but at this point I had to move on, as the tapeout deadline was fast approaching.
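The replenish estimate in this section boils down to $I = C \times V / t$. Below is a small sketch that evaluates it with the numbers quoted in the text (500 fF, 70 mV droop, 200 ps) and also turns the relation around, to see how long a given bias current takes to restore the droop. The function names are made up for illustration, and the model ignores everything except the ideal capacitor.

```python
# Charge-balance estimate for topping up the pump capacitor.
# Ideal capacitor only: no switch resistance, no bias node dynamics.

def replenish_current(c_farad: float, dv_volt: float, t_sec: float) -> float:
    """Constant current needed to restore dv_volt on c_farad within t_sec."""
    return c_farad * dv_volt / t_sec

def replenish_time(c_farad: float, dv_volt: float, i_amp: float) -> float:
    """Time a constant current i_amp needs to restore dv_volt on c_farad."""
    return c_farad * dv_volt / i_amp

C_FB = 500e-15   # pump cap value used in the estimate
DROOP = 0.07     # 10 % of the 700 mV fully charged voltage
T_SLOT = 200e-12 # time allowed for replenishing

i_needed = replenish_current(C_FB, DROOP, T_SLOT)
t_at_128u = replenish_time(C_FB, DROOP, 128e-6)
print(f"current for {T_SLOT * 1e12:.0f} ps: {i_needed * 1e6:.0f} uA")
print(f"time at 128 uA: {t_at_128u * 1e12:.0f} ps")
```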

Common-mode feedback

The CMFB is perhaps the most challenging part of this architecture because of the many unknown factors, such as the load capacitance and the additional parasitics via the compensation MIM cap Cc. Not knowing the exact pad + PCB track capacitance was another factor which could screw up the stability of the loop. While there are several possible CMFB loop stabilization approaches, I decided to use Miller compensation with a zero-cancellation (nulling) resistor, which also limits the feedforward current. In that scheme, to a first-order approximation, there is a non-dominant pole at:

$$p_{nd} = \frac{g_{m8}}{C_{gs4} + C_{ds0}}$$

Also from $C_{c}$ and $R_{z}$:

$$p_{c} = \frac{-1}{R_{z}\,\dfrac{C_{gs4}C_{c}}{C_{gs4}+C_{c}}}$$

And because of the nulling resistor you also get a cancellation zero:

$$z_{c} = \frac{-1}{R_{z}C_{c}}$$

This zero cancels the output pole and adds phase lead. Here is the response of the system obtained via AC SPICE simulations:

The bottom plot shows the loop's response without compensation. As you might note, the compensation does a pretty good job of keeping the system at 60 deg phase margin, without significant phase roll-off even beyond the 0 dB magnitude level, implying that it should hold up well across PVT corners. Which it does, except that this simulation does not include the parasitics, which kind of ruin the PM of the system (read further).
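To get a feeling for where the first-order pole and zero expressions of this section land, here is a hedged sketch that evaluates their magnitudes in Hz. None of the component values below are from the actual design; they are placeholders picked only to make the script run, and the function name is made up.

```python
import math

# Magnitudes of the first-order CMFB pole/zero estimates, converted to Hz.
# All example component values are ASSUMPTIONS, not the real design values.

def cmfb_pole_zero_hz(gm8, c_gs4, c_ds0, r_z, c_c):
    """Return |p_nd|, |p_c| and |z_c| in Hz for the given element values."""
    c_series = c_gs4 * c_c / (c_gs4 + c_c)   # Cgs4 in series with Cc
    rad_per_s = {
        "p_nd": gm8 / (c_gs4 + c_ds0),
        "p_c": 1.0 / (r_z * c_series),
        "z_c": 1.0 / (r_z * c_c),
    }
    return {name: w / (2.0 * math.pi) for name, w in rad_per_s.items()}

f = cmfb_pole_zero_hz(
    gm8=1e-3,       # ASSUMPTION: 1 mS
    c_gs4=200e-15,  # ASSUMPTION: 200 fF
    c_ds0=50e-15,   # ASSUMPTION: 50 fF
    r_z=2e3,        # ASSUMPTION: 2 kOhm nulling resistor
    c_c=2e-12,      # ASSUMPTION: 2 pF compensation cap
)
for name, hz in f.items():
    print(f"{name}: {hz / 1e6:.1f} MHz")
```

One thing the expressions imply regardless of the exact values: since the series combination of $C_{gs4}$ and $C_{c}$ is always smaller than $C_{c}$, $p_{c}$ sits above $z_{c}$ by the factor $(C_{gs4}+C_{c})/C_{gs4}$.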

While there are probably more efficient compensation approaches, such as a proper (buffered) indirect compensation with e.g. a source follower, which would cancel the feedforward current completely and allow the use of a much smaller compensation capacitor, these take more design time. Hence the approach here: use a large direct Miller compensation and not care about area. As seen later in the text, this approach comes at a bandwidth cost due to the parasitics on the bottom plate of the MIM cap.

The common-mode range is dictated by the Vds saturation margins of the switchable current source, as well as by the input range of the CMFB error amplifier. In the current design it was estimated to span 0.6 V to 0.9 V. Most modern FPGA receivers can handle such LVDS common-mode voltages. The LVDS swing was tuneable between 50 mV and 300 mV peak-to-peak.

Load model

I've used the ESD in the pads in combination with 2 pF of (extra) internal capacitance, as well as about 6 pF for the PCB track and FPGA pads. As we shall see from the measurements, these values might have been a bit on the optimistic side. Here's the load model used for design purposes:

Doubling the far-end capacitance might have been a better idea.
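A very crude way to see why the far-end capacitance matters is sketched below. It assumes a lumped first-order model: each line carries its capacitance to ground and the 100 ohm receiver termination sits between the two lines, so each half-circuit sees 50 ohm. The ESD capacitance is left out because its value isn't given, and a real PCB trace behaves as a transmission line rather than a lumped cap, so treat the numbers as order-of-magnitude only.

```python
import math

# Lumped first-order estimate of the differential settling at the receiver.
# ASSUMPTIONS: ideal current-mode driver, lumped caps, ESD cap ignored.

def diff_time_constant(r_term: float, c_line: float) -> float:
    """Half-circuit RC: each line sees r_term / 2 to the virtual ground."""
    return 0.5 * r_term * c_line

def rise_time_10_90(tau: float) -> float:
    """10 % to 90 % rise time of a single-pole response."""
    return math.log(9.0) * tau

R_TERM = 100.0  # ohm, FPGA receiver termination
C_NEAR = 2e-12  # F, extra internal capacitance per line
C_FAR = 6e-12   # F, PCB track + FPGA pad per line

for label, c_far in (("as designed", C_FAR), ("far end doubled", 2 * C_FAR)):
    t_r = rise_time_10_90(diff_time_constant(R_TERM, C_NEAR + c_far))
    for rate in (250e6, 600e6):
        ui = 1.0 / rate  # one bit period
        print(f"{label}, {rate / 1e6:.0f} Mbps: rise {t_r * 1e12:.0f} ps "
              f"= {100 * t_r / ui:.0f} % of the bit period")
```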

Measurements

So! It underperformed greatly, but luckily didn't fail! It was designed for 600 Mbps but achieved only 250 Mbps. So what happened? Here are a few theories:

— inefficient architecture + design: the Class A bias driver used by the switched current source "pump" capacitors is a rather inefficient approach. A better implementation would use a proper switched-cap buffer, which should significantly improve the current source switching and hence the settling.

— last minute change: Yes, these are always risky! Dummy fill was deliberately blocked inside the LVDS core out of concern about its parasitics. However, Dongbu returned the layout with concerns about overetching of a few logic gates inside the Tx. Rushing to submit a new GDSII, I removed all dummy fill blockers under the feedback caps without running parasitic extraction. This added some parasitic capacitance between the bottom plate of the pump capacitors and the substrate. Apparently this extra capacitance greatly reduced the bandwidth of the switched current sources. Oops!

As a result, due to asymmetric loading of the control lines, the common-mode of the P and N lines drifted, thus closing the eye of the data line. In principle this shouldn't have mattered as the LVDS Rx in the FPGA is AC-coupled anyway, but the FPGA deserialization block just couldn't lock the training pattern at the designed speeds...

— more parasitics: while I didn't really observe any CMFB instability, some common-mode ringing showed up at high speeds, which is probably the reason why the ISERDES failed to lock. The loop's PM has been reduced by the extra parasitics on the Miller compensation:

— even more undervalued parasitics: to reduce the body effect on the switches and help improve substrate noise isolation I decided to put the N-FET switches under a local DNW and tie bulk to source. Doing this adds an extra reverse-biased junction at the tail bias node, which lies exactly within the feedback loop, reducing phase margin. I knew about that effect, so I decided to run some ballpark junction capacitance estimations by fetching some parameters from the SPICE models and using the typical junction capacitance model:

$$C_{tot} = \frac{A_{d}\,C_{j}}{\left(1+V/V_{o}\right)^{mj}} + \frac{P_{d}\,C_{jsw}}{\left(1+V/V_{o}\right)^{mjsw}}$$

where $mj$ and $mjsw$ are process-specific coefficients which, along with $C_{j}$ and $C_{jsw}$, I fetched from the SPICE models. This led to the following result:

Computing 5 femtofarads for a 10 x 10 um diode? I must have lost my mind; this model is very, very wrong, and the actual value must be larger. At that point, however, the deadline was approaching so quickly that I simply had to keep ignoring things. Here are some waveforms captured at 100 Mbps using the 100 ohm termination resistance built into the FPGA receiver:

Physical overview:

The layout is rather huge, as may be noted from the layout diagram. I removed all active circuitry under the pump and Miller caps so as to reduce parasitics. However, the last-minute dummy generation sadly filled all of the area below them.

Nevertheless, the driver does just fine up to 250 Mbps and allows me to run and evaluate the ADCs at full speed, although it would have been nicer not to have to convert and read out at the same time. Hope this mixture of thoughts helps someone who decides to try this architecture.
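Coming back to the junction capacitance estimate for a moment: for anyone who wants to redo that ballpark calculation, here is a sketch of the usual reverse-bias junction capacitance expression, with the grading exponent applied to the $(1+V/V_{o})$ term only. The per-area and per-perimeter values, the built-in potential and the exponents below are placeholders and not the Dongbu model parameters; substitute the numbers from your own SPICE model card.

```python
# Reverse-biased junction capacitance: bottom-plate plus sidewall component.
# Units: area in um^2, perimeter in um, cj in fF/um^2, cjsw in fF/um,
# so the result is in fF. All example parameter values are ASSUMPTIONS.

def junction_cap_ff(area, perim, cj, cjsw, v_rev, v0, mj, mjsw):
    """Total junction capacitance in fF at reverse bias v_rev (volts)."""
    bottom = area * cj / (1.0 + v_rev / v0) ** mj
    sidewall = perim * cjsw / (1.0 + v_rev / v0) ** mjsw
    return bottom + sidewall

AREA = 10.0 * 10.0   # 10 x 10 um diode
PERIM = 4 * 10.0     # its perimeter

params = dict(cj=0.2, cjsw=0.3, v0=0.7, mj=0.4, mjsw=0.3)  # placeholders

for v_rev in (0.0, 0.6, 1.2):
    c = junction_cap_ff(AREA, PERIM, v_rev=v_rev, **params)
    print(f"V_rev = {v_rev:.1f} V: C_tot = {c:.1f} fF")
```

At zero bias the expression collapses to $A_{d}C_{j} + P_{d}C_{jsw}$, which makes for a handy sanity check against the model card before trusting any biased value.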

Date:Thu Apr 06 14:02:09 GMT 2017




Six questions you should ask your future employer

with an emphasis on VLSI, but probably applicable to any other eng-sci field

If one hasn't been showered with job offers already, chances are high that one has had to send that sacred job application at some point. Here's my list of six questions one should (must?) ask, along with the necessary work-related topics, when seeking a new job.

Work and work environment — As an employee you'll most likely spend 9+ hours daily at the job, and you must be happy with what you do. Make sure you go and check out the working conditions before accepting an offer, even if the office turns out to be on the dark side of the moon – do that! You should never accept blind offers based solely on telephone interviews, unless you are super desperate and have absolutely no other choice. Not knowing what to expect on your first day at the job might leave you disappointed. Never judge your workplace by its name and/or reputation.

Meet coworkers — Another important reason why you should visit your prospective employer is to meet your coworkers. Make sure you meet everyone on the team. Typically, good groups will set you up with private interviews with future colleagues, which is an excellent opportunity for you (and them) to see whether you'll like each other. Are they...eeergh... idiots, slackers? Yes? Run fast!

Design practices — The individual private meetings are an excellent opportunity for you to ask more about the actual job, as well as to get a glimpse of their design practices. Are they trying to save on licensing costs by torturing their engineers into pointless heroics? Just so the corporate boss can show off and say he's a great manager who just saved the company three bucks at the expense of your nervous system. This is also the time for you to ask counter-questions so you can see whether you should run fast.

The future — The group's future directly impacts you, so don't hold back on asking about it. However, be cautious, as there's a chance you'll hear a ton of lies about how rosy the situation at the company currently is: expansion, profit, growth... You might also get a counter-question about your future vision (or whatever). Be careful what you answer, as such questions might have nothing to do with your job; your employer might just be fishing and checking out what's under your hood. Here's a classic (in case you're applying for a jr. position) – "In the future would you like to take the management path or would you rather stick to R&D?" Remember what you're applying for; if you're asked anything similar, be sure it's a trap.

Know your value — It really depends on your skills as well as on the country and company you are applying to. Pay-related discussions might mostly be a trap for young players, but generally speaking the VLSI industry is pretty uniform and honest. If you are applying in the US or another country with capitalistic views, there's a high chance you'll get an offer that barely reaches the industry's mean level of pay. Don't be shy about asking for more, but be careful – know your value, you cannot really ask for 20x more than what you're offered initially... well, except for some special cases...

Backgrounds — Do your homework. Check the background of the group you're applying to. Ask former employees, be a detective, stalker, or however you want to call it. Glassdoor is one starting point.

Make sure you explore your options thoroughly!

Date:Thu Apr 06 14:02:09 GMT 2017
