Using Agile development practices for kernel development A.K.A - Bringing sanity to chaos Chase Maupin, system integration manager for the Linux Core Product Development (LCPD) team 1
Agenda • Agile Manifesto • Meet LCPD - Charter and team • What’s the problem? • Mmhmm, you can fix it right? • Let’s make sausage • Would you do it again? • Continuous improvement 2
Agile Manifesto 3
Agile Manifesto • We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value: – Individuals and interactions over processes and tools – Working software over comprehensive documentation – Customer collaboration over contract negotiation – Responding to change over following a plan • That is, while there is value in the items on the right, we value the items on the left more • http://agilemanifesto.org/ 4
Meet LCPD 5
Where in the world is LCPD? • LCPD is spread out across the world in six time zones: – West Coast US – Central US – East Coast US – Germany – Finland – India • LCPD Functional Teams: – Baseport – Power & Thermal – Connectivity – Audio & Display – System Test – System Integration 6
What would you say…you do here? • LCPD charter – Creation of high quality, scalable Linux solutions for processors through upstream development of U-Boot, the Linux kernel, tool chain and file system – Ensure maximum software reuse and device entitlement by working with silicon design teams in providing feedback and requirements on new SoC architectures • Translation from manager to ‘techie’: – Work with the upstream communities for our software components to ensure that TI devices are supported in the mainline and work without additional patches – Address feedback from the community and regressions in the mainline to maintain quality – Work with our design teams to make sure seemingly simple design decisions don’t have ripple effects through the software 7
What’s the problem? 8
It’s a big world after all • As mentioned previously we have team members around the world in six different time zones • Furthermore within each functional area we have team members spread around the world • This makes co-ordination difficult among team members due to limited overlapping work time • IRC helps some but we needed more collaboration 9
Everyone wants a piece • LCPD services multiple customers each with: – Their own set of devices they care about – Their own priorities and release schedules – Their own set of end customers with requirements and issues • LCPD engineers care about the IP first, not the device – Develop the feature or fix the issue for the IP on all devices – This means that teams are not organized by device (i.e. a kernel team per device) but instead by IP and functional areas • This leads to the same developers being asked to develop features for multiple customers, and to a need for a single voice prioritizing and directing these efforts 10
A balancing act • The LCPD charter is to develop support for TI devices upstream. This is how we ensure sustainable, quality software development • As part of this, the community provides us feedback and requirements, which require effort from TI • This effort has to be balanced alongside the requirements from the internal customers that LCPD serves • Furthermore, as merge windows approach, the priority of community tasks increases, since missing a merge window means carrying patches out of tree for months 11
Square peg, meet round hole • Many of our developers specialize in a particular IP or kernel subsystem • Experts require less ramp time which improves efficiency • This efficiency comes at the cost of cross functionality – We do not view developers as interchangeable cogs – Rather we would like to encourage developers to branch out into other interest areas 12
Sometimes the molehill IS a mountain • Support for TI devices HAD NOT been pushed upstream and instead consisted of thousands of patches on old kernel revisions • Moving these patches upstream while also developing support for new devices and IP was overwhelming • We needed a way to keep track of the mountain but only worry about one molehill at a time – Currently our focus devices of AM335x, AM437x, OMAP5, and DRA7xx all boot directly from the mainline kernel with additional driver support being added 13
Mmhmm, you can fix it right? 14
Scrum, it’s not as dirty as it sounds • LCPD chose Scrum as the Agile process to help address our problems • Having a shared backlog prioritized between customers allowed easier communication of trade-offs and visibility into the team’s workload • Giving developers focused time (a sprint) to work on items helped ease the chaos of fire fighting and priority churn – Reduced the shell-shock as well. Looking back, we had moved the mountain one boulder at a time • Making upstreaming part of the process kept focus on our charter 15
Make sure you have the right tool • Needed an online tool which can be accessed both inside and outside of our firewall – This is particularly helpful for our remote/home based developers • Needed a tool that allows all of LCPD to share a backlog while still grouping development tasks for functional teams • Needed a single tool that handles release planning, sprint tracking, etc. • Needed something that integrates with bug trackers like CQ to allow us to track bugs in a unified backlog • Wanted to give our customers visibility into our backlog, priorities, and progress – This allows them to pull information, rather than us having to push constant updates when requested • LCPD chose VersionOne (V1), an Agile SW development management tool • NOTE: There are many other good tools available to choose from; this was just the one we picked 16
Let’s make sausage 17
Sometimes I feel like you are a world away • As mentioned in the LCPD introduction our team is scattered around the world • Furthermore, the members of the different functional teams are scattered (limited co-location) • There is very little time overlap to allow for scrum meetings at a functional team level • Scrum teams are organized first by time zone, then by functional area • Backlog refinement meetings are held weekly at the functional team level – The functional team reviews that domain’s backlog at that time – People align on which team members plan to take which backlog item 18
It’s done when WE say it’s done • LCPD shares a definition of when something is done, which reduces confusion • A development item is done when: – The code has been written – The code has been validated (system test or developer) – Where appropriate the patches have been submitted upstream for review • In this manner the upstreaming of work is part of our development flow • A defect item is done when: – The code has been written – The code has been merged into the production tree – Where appropriate the patches have been submitted upstream for review – System test has validated the fix in the production tree • The main difference is that system test operates against the production tree. Defects found there are checked for applicability to the latest mainline and if so fixed for mainline and then backported to production tree 19
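The two definitions of done above can be written as a simple check. This is only an illustrative sketch: the `Item` class, its field names, and `is_done` are hypothetical and are not taken from VersionOne or any LCPD tooling.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical backlog item; field names are illustrative only."""
    kind: str                            # "development" or "defect"
    code_written: bool = False
    validated: bool = False              # system test or developer validation
    merged_to_production: bool = False   # defects only
    system_test_validated: bool = False  # defects only, against the production tree
    upstream_applicable: bool = True     # "where appropriate"
    submitted_upstream: bool = False


def is_done(item: Item) -> bool:
    """Apply the shared definition of done to one backlog item."""
    # Upstream submission is only required where it is appropriate.
    upstream_ok = item.submitted_upstream or not item.upstream_applicable
    if item.kind == "development":
        return item.code_written and item.validated and upstream_ok
    if item.kind == "defect":
        return (item.code_written
                and item.merged_to_production
                and upstream_ok
                and item.system_test_validated)
    raise ValueError(f"unknown item kind: {item.kind!r}")
```

Note that "done" for a development item means submitted for review, not merged upstream.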
I want it NOW • Support escalations can happen at any time • Customers generally don’t care if you are in the middle of a sprint • How do you plan a sprint for two weeks and still be responsive to customers? – Many scrum practitioners face this same problem so no need to invent anything new • Allocate overhead in each sprint for the typical customer support load – Usually about 25% – This time lets customers see progress being made – For simple issues this is likely enough – For complex issues this is enough to replicate the problem and plan more time in the next sprint • The Kernel Community is treated as a critical “customer”. This gives us time to respond to feedback • If no customer support comes in we can opportunistically work on something else from the backlog, assist other team members, or do code clean-up, etc 20
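As a rough worked example of the overhead allocation above, the sketch below splits a sprint's capacity into a support reserve and plannable time. The function name and the team size and hours used in the example are assumptions for illustration; only the "about 25%" figure comes from the slide.

```python
def plan_sprint_capacity(team_hours: float, support_fraction: float = 0.25) -> dict:
    """Split sprint capacity into a support reserve and plannable hours.

    support_fraction defaults to the "usually about 25%" overhead;
    a team would tune it to its own typical support load.
    """
    if not 0.0 <= support_fraction <= 1.0:
        raise ValueError("support_fraction must be between 0 and 1")
    support_hours = team_hours * support_fraction
    return {
        "support_hours": support_hours,
        "planned_hours": team_hours - support_hours,
    }


# Hypothetical team: 5 developers, 10 working days, 6 focused hours per day.
capacity = plan_sprint_capacity(5 * 10 * 6)
# 300 hours total: 75 reserved for support, 225 available for backlog stories.
```

If the reserve goes unused, the remaining hours are spent on other backlog items, helping team members, or clean-up, as described above.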
How long will it take to upstream this? • Upstreaming is a process that takes time • It is not a process that can always be predicted • So how do you handle upstreaming in Scrum, with fixed time boxes and an indeterminate process? • Going back to the LCPD definition of done, we consider an item “done” when we have submitted it upstream for review – Small feedback goes into the “customer support” overhead bucket – Significant feedback gets a new story allocated to address the feedback and make a new submission. This is given critical priority • This cycle iterates until the work is upstream • If we expect feedback on a series we plan for it in the next sprint, e.g. an RFC will likely have feedback that needs to be addressed 21
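The feedback handling above amounts to a two-way triage, sketched below. The `Sprint` class and `triage_feedback` are hypothetical names; what counts as "significant" feedback is a judgment call the slide does not define, so the caller passes it in.

```python
from dataclasses import dataclass, field


@dataclass
class Sprint:
    """Hypothetical sprint container; not tied to any real tool."""
    support_bucket: list = field(default_factory=list)  # small feedback, absorbed by overhead
    stories: list = field(default_factory=list)         # planned backlog stories


def triage_feedback(sprint: Sprint, series: str, significant: bool) -> None:
    """Route upstream review feedback on a patch series."""
    if significant:
        # Significant feedback gets its own story at critical priority.
        sprint.stories.append({
            "title": f"Address review feedback and resubmit: {series}",
            "priority": "critical",
        })
    else:
        # Small feedback is handled within the customer support overhead.
        sprint.support_bucket.append(series)
```

Calling this once per review round mirrors the cycle that repeats until the series is accepted upstream.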
It’s bigger than just you • As active community developers some LCPD team members also have maintainership responsibilities in the broader community • In our Scrum implementation we handle this by creating recurring stories representing the maintainership time and tasks • The maintainers pull these stories into every sprint, ensuring that they have enough time reserved to take care of not just TI, but their community responsibilities as well 22