On a recent trip to Germany, I was a passenger in a car with my colleague Tom De Schutter on the German Autobahn. We had just landed in Frankfurt and were driving to Aachen. Anticipating the speed at which we might drive filled me with both excitement and anxiety. What I learned was eye-opening: Germany allows for speed, but it also enforces rules of the road to ensure safer driving. You need a decent car and a driver who has nerves of steel and is trained in the rules of the road. It helps that the roads are in good shape and that you are driving among others who have been trained from childhood to drive on the Autobahn. I don’t claim to be an expert and won’t go into the details of the rules, but I will point you to this HowStuffWorks article on the Autobahn: http://auto.howstuffworks.com/autobahn2.htm

It got me thinking about the parallels between driving on the Autobahn and how we are driving faster and faster in the tech world to get products to market. The whole product life cycle is shrinking while complexity is increasing. Life cycles of 12 months or less are quite common in mobile and consumer products, and even automotive and industrial cycles are shrinking. So how do you go fast and still remain sane? You clearly cannot do the same thing you did before and just attempt to do it faster. You need to systematically consider all aspects of the process to adjust for this speed. Going fast without training, rules of the road, better vehicles and better roads can be disastrous. Changes are needed.

To use yet another comparison, innovation in product without innovation in process, methodology, team dynamics and organization is like driving a Maserati without any training in driving a fast car. So why would you sign up to develop your new product in a much shorter time with the same process as before?
Everything is fair game for examination: the social dynamics of how teams work together, the HW and SW development process, the testing process, the training process, and the rollout to the field and customers. When I wrote this article for Tech Design Forum, I had this in mind. De-risking a project and working with speed, yet with quality, is about changing the way we develop:

– Not just using PowerPoint, spreadsheets and discussions to make design decisions, but actually simulating the products and doing what-if analysis.
– Not waiting until the end of the project to do 50% of the solution development, i.e. the software. Instead, start early, and incrementally develop and test against the targets available in each phase. For example, start with models, move to FPGA boards and sample boards, and reserve only the final tests for the final boards. Why risk doing all your development on the final boards? Why not limit the use of real hardware to the final test phase?
– Even HW development can be agile when you have multiple feedback iterations from running actual SW on the models, even before RTL is created or committed to. Why depend on SW to work around any HW quirks? Why not change the HW design early so both can be optimal?
– Even when not all SW teams are available to work on the project, start with the sub-groups that are available. Enable them with the portion of the prototype that they need to get their work done.
– Why not do what-if analysis with models so you can fine-tune performance and power utilization? Why wait until you get complaints from customers to make the adjustments? The longer we wait, the more expensive changes become.
– Multicore designs are one way to get more horsepower into designs. But horsepower without knowing how to use it can be a huge waste. Developing software for multicore designs is still an art, and having more runway to develop and test it, with full visibility into the behavior of the different cores, is critical to its success.
More than anything, as I get older and smarter, I realize that I don’t want to stress my mind and work whenever I don’t need to. Gone are the days when I crammed for an exam the night before and stayed up all night. I plan and spread my work over the days I have, to minimize the risk, stress, and wear and tear. It has resulted in better quality work and a happier life. I am also happy to say that I survived the Autobahn and have a healthy respect for speed, and for the fact that it can be done safely with some planning. I wish all of you speed, safety, happiness and a stress-free holiday season.
What does the Sagrada Familia and Embedded Linux have in common?
Posted in Embedded Software | Comments Off on What does the Sagrada Familia and Embedded Linux have in common?
Watching the Olympics this past summer was quite exciting. I enjoyed seeing athletes at the peak of their performance and multiple records broken in many sports. What we don’t see is the years of practice and work behind this excellence. These athletes work at the technique, strength, endurance and mental attitude of winning. To me, this is no different than the work that goes on behind the scenes of a new chip introduction or for that matter, any new product introduction.
Nine months after the next-generation SoC project was kicked off, the first prototype board has finally arrived! There are just six months left to get Android and Linux up and running. Since Android should take full advantage of the latest hardware additions, let’s make sure we get it ported as quickly as possible. Unfortunately, it is not that easy. Before you can think about porting your OS and developing the drivers for your special hardware, you have to deal with the boring but necessary initialization of the hardware. This is typically done by a so-called boot monitor. A boot monitor is a software program in ROM that is launched when you press the power-on button on your hardware. Even though the boot monitor is not a large piece of software (compared to the OS, for instance), it provides complex functions and interacts with many hardware peripherals. As an example, the classic minimal functions of a boot loader are given below:
Debugging software by adding printf statements to the code is not considered the cleanest or most advanced approach, but when you are searching for the root cause of a problem, you often reach for the debugging method you know best and can apply most easily. Setting up a complex debug or trace tool is a hurdle that feels counterproductive under schedule pressure, and printf is the fallback that almost always works, as long as you are able to modify and rebuild the software source code. I was recently talking to a software engineer who was working on driver development for a new WiFi chip. When I asked how he was debugging the driver, he answered, “I am doing printk debugging” (printk prints debug messages inside the Linux kernel). His reason was to avoid the complicated steps involved in setting up a kernel debugger such as kdb. Luckily, the entire Linux kernel is full of printk messages, just as each and every module of the Android source code is instrumented with “log” messages. The beauty of this kind of debugging is that it exposes the semantics of the software and shows what the software is actually doing. Instead of tracing functions like foo and bar, you get useful information such as a “Probing device returned false!” message. It becomes immediately clear what is going on, even if the code was written by somebody else. However, there are limitations, drawbacks and side effects to consider:
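To make the difference between function tracing and semantic messages concrete, here is a minimal sketch. It uses Python’s standard logging module as a stand-in for printk, and the logger name `wifi_drv` and the `probe_device` routine are made up for illustration; they are not taken from any real driver.

```python
import logging

# "wifi_drv" is a made-up logger name, standing in for a driver's log prefix.
log = logging.getLogger("wifi_drv")


def probe_device(device_present: bool) -> bool:
    """Hypothetical probe routine; real driver logic is not shown here."""
    # Function-trace style: tells you where the code is, not what happened.
    log.debug("enter probe_device")
    if not device_present:
        # Semantic style: states the outcome in plain words.
        log.error("Probing device returned false!")
        return False
    log.info("Device probed successfully")
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG,
                        format="%(levelname)s %(name)s: %(message)s")
    probe_device(device_present=False)
```

Someone who has never seen this code can still read the second message and know what went wrong, which is the point made above about printk-style debugging.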
Developing embedded software often requires a physical target on which to run the software for validation and debug. Often, though, the exact hardware does not exist yet. The software developer is then faced with a few choices: explore using models, use a previous-generation board, or consider another solution for which the exact hardware is available. Most software developers look for a pragmatic, “good enough” solution that meets their time and cost requirements. For example, at the recent Linaro Connect event, software developers showed how they leveraged “old” hardware to carry out their Linux scheduler development for big.LITTLE processing. Here, the performance asymmetry of an ARM Cortex-A15/A7 MPCore big.LITTLE processing system is emulated using an off-the-shelf Cortex-A9 MPCore board. In big.LITTLE processing, two CPUs with different performance characteristics are combined, while the Cortex-A9 MPCore has CPUs with identical performance. To mimic big.LITTLE processing on the Cortex-A9 MPCore, asymmetry is emulated by running a so-called “cycle stealer” software process on one of the Cortex-A9 CPUs, which reduces the processing bandwidth available on that CPU. The resulting setup mirrors the expected big.LITTLE processing capabilities: the software under test takes longer to run on one CPU than the same software takes on the other. Is it cycle accurate? Certainly not, but it is a pragmatic, “good enough” solution to start optimizing the Linux kernel scheduler.
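The cycle stealer idea can be sketched in a few lines. This is not the code shown at Linaro Connect; it is a minimal illustration that assumes the stealer is a process that busy-spins for a fixed fraction of every period and always gets the CPU time it asks for. The names `steal_cycles`, `slowdown_factor` and `pin_to_cpu` are hypothetical, and CPU pinning relies on `os.sched_setaffinity`, which is available on Linux only.

```python
import os
import time


def slowdown_factor(steal_fraction: float) -> float:
    """Expected runtime multiplier for work sharing a CPU with the stealer."""
    if not 0.0 <= steal_fraction < 1.0:
        raise ValueError("steal_fraction must be in [0, 1)")
    return 1.0 / (1.0 - steal_fraction)


def pin_to_cpu(cpu: int) -> bool:
    """Pin the calling process to one CPU where the OS supports it (Linux)."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    try:
        os.sched_setaffinity(0, {cpu})
        return True
    except (OSError, ValueError):
        return False


def steal_cycles(steal_fraction: float, period_s: float = 0.01,
                 duration_s: float = 0.1) -> float:
    """Busy-spin for steal_fraction of every period; return seconds spent spinning."""
    busy_total = 0.0
    end = time.perf_counter() + duration_s
    while time.perf_counter() < end:
        start = time.perf_counter()
        busy_until = start + steal_fraction * period_s
        while time.perf_counter() < busy_until:
            pass  # burn cycles so other work on this CPU gets less time
        busy_total += time.perf_counter() - start
        time.sleep((1.0 - steal_fraction) * period_s)
    return busy_total
```

In use, one would call `pin_to_cpu` for the CPU that should play the “little” role and then run `steal_cycles` there while the software under test runs on both CPUs. Under the assumption above, stealing half of the cycles makes CPU-bound work on that CPU take roughly twice as long.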
big.LITTLE processing refers to the concept of combining a high-performance ARM Cortex™-A15 MPCore™ processor with an energy-efficient Cortex-A7 processor. ARM recently introduced two primary use models for big.LITTLE processing: task migration and MP. In the task migration use model, applications migrate from one cluster to the other based on some criteria. The MP use model, on the other hand, allows both CPUs to run simultaneously. Which software should run on the Cortex-A15 and which on the Cortex-A7 is likely to be decided at runtime by a power-aware scheduler in the operating system kernel.
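To illustrate the difference between the two use models, here is a toy sketch. It is not ARM’s or the Linux kernel’s scheduling policy: the `Task` type, the single load number per task and the 0.7 threshold are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    load: float  # hypothetical demand, from 0.0 (idle) to 1.0 (fully busy)


def migration_cluster(tasks, up_threshold=0.7):
    """Task migration model: everything runs on one cluster at a time."""
    peak = max((t.load for t in tasks), default=0.0)
    return "big" if peak > up_threshold else "LITTLE"


def mp_placement(tasks, up_threshold=0.7):
    """MP model: both CPU types are active, so each task is placed on its own."""
    return {t.name: ("big" if t.load > up_threshold else "LITTLE")
            for t in tasks}


if __name__ == "__main__":
    tasks = [Task("ui", 0.2), Task("game", 0.9)]
    print(migration_cluster(tasks))
    print(mp_placement(tasks))
```

In this sketch, one demanding task moves the whole workload to the big cluster under task migration, while under MP only that task lands on the big CPU and the light one stays on the LITTLE CPU. A real power-aware scheduler would weigh far more than a single load number.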
Posted in Uncategorized | Comments Off on A Closer Look at Software Development for ARM’s big.LITTLE Processing – Part II
How to win over the embedded software developer, their customer and their boss.
In the last month, I had the opportunity to get some hands-on experience with hardware virtualization and hypervisors. Until now, my knowledge of the topic was mainly limited to what I could read about it and what other people were saying about it. However, the PowerPoint slides I’ve seen leave a lot of white fog between the bullet items. This didn’t make me feel very comfortable talking about the topic myself, but there was no escape. Hypervisors play an increasingly important role for system designers, whether for supporting multiple guest operating systems on the same device or for taking advantage of ARM®’s new big.LITTLE™ processing. The fog is not all gone, but let me provide you with some insight into what I found out. As a disclaimer, I’m not going to (and I cannot) write an expert almanac about all the aspects of virtualization covering Xen, VMware, etc. Instead, I’m going to focus on my personal experience, which I believe will be relevant to you as well. This post is the starting point for a series on this topic in this blog.
Posted in ARM, Embedded Software, Energy and Performance, Hypervisor, Power Management, Virtual Prototypes, Virtualization | Comments Off on A Closer Look at Software Development for ARM’s big.LITTLE Processing – Part I
Transaction-level models are the main building blocks of virtual prototypes, which are used for early software development. In my last blog post, I briefly introduced the different kinds of software tasks and the implications for models. Today, I want to talk about the modeling requirements for early SoC bring up. As I mentioned, understanding the software requirements correctly provides two clear benefits: 1) it makes modeling easier through a more focused application and 2) it increases the value for the software developer through more tailored modeling capabilities such as debug features.
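For readers who have not worked with transaction-level models, the following sketch shows the basic idea: software accesses a device through whole read and write transactions, modeled as function calls, rather than through pin-level signal activity. The `UartModel` register layout, the `Bus` class and the address `0x1000` are invented for illustration and do not correspond to any particular modeling library.

```python
class UartModel:
    """Hypothetical UART: offset 0x0 is the TX data register, 0x4 the status register."""
    TX_DATA = 0x0
    STATUS = 0x4
    TX_READY = 0x1

    def __init__(self):
        self.sent = bytearray()  # everything the "software" has transmitted

    def read(self, offset):
        # This simple model is always ready to accept another byte.
        return self.TX_READY if offset == self.STATUS else 0

    def write(self, offset, value):
        if offset == self.TX_DATA:
            self.sent.append(value & 0xFF)


class Bus:
    """Routes each read/write transaction to the device mapped at that address."""

    def __init__(self):
        self._regions = []

    def map_device(self, base, size, target):
        self._regions.append((base, size, target))

    def _find(self, addr):
        for base, size, target in self._regions:
            if base <= addr < base + size:
                return target, addr - base
        raise ValueError(f"no device mapped at {addr:#x}")

    def read(self, addr):
        target, offset = self._find(addr)
        return target.read(offset)

    def write(self, addr, value):
        target, offset = self._find(addr)
        target.write(offset, value)
```

A driver-like loop would poll the status register with `bus.read(0x1004)` and then push each byte with `bus.write(0x1000, byte)`. Because the model is plain software, adding debug features such as logging every transaction is straightforward.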