
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

Born on the 4th of July

Star of India, San Diego, CA

I wasn't born on the 4th of July, but my wife was. This week we have a perfect storm of activities that have conspired to postpone the second installment of the Re-read of The Mythical Man-Month: Essays on Software Engineering this Saturday. My wife's birthday, the national holiday and a family reunion are keeping me away from the keyboard today. So please pardon the interruption!


Categories: Process Management

Happy 4th of July

Herding Cats - Glen Alleman - 4 hours 35 min ago

Semper Fidelis to all my colleagues and friends. Wait till minute 3:57 

Categories: Project Management

Agile Program Manager

Individually, each boy is a project. Together they're a program to be managed.

Scaling Agile project management to large, complex endeavors requires an Agile Program Manager to address the big-picture coordination of programs. Program management is the discipline of coordinating and managing large efforts composed of a number of parallel and related projects. Scrum leverages a concept called the Scrum of Scrums to perform many of the activities needed for program management. Agile Program Management is not just repurposed project management or a part-time job for a Scrum Master.

Agile Program Managers coordinate and track expectations across all projects under the umbrella of the program, whether the projects are using Agile or not. Coordination includes activities like identifying and tracking dependencies, tracking risks and issues, and communication. Coordination of the larger program generally requires developing a portfolio of moving parts at the epic or function level across all of the related projects (epics are large user stories that represent large concepts that will be broken down later). Agile Program Managers layer each project's release plans on top of the program portfolio to provide a platform for coordinated release planning. Techniques like Kanban can be used for tracking and visualizing the portfolio. Visualization shows how the epics or functions are progressing as they are developed and staged for delivery to the program's customers.

Facilitating communication is one of the roles of an Agile Program Manager. The Scrum of Scrums is the primary vehicle for ensuring communication. The Scrum of Scrums is a meeting of all of the directly responsible individuals (DRIs) from each team in the program. The DRI has the responsibility to act as the conduit of information between his or her team, the Agile Program Manager and the other DRIs. The DRI raises issues, risks, concerns and needs; in short, the DRI communicates to the team and to the Scrum of Scrums. The Scrum of Scrums works best as a daily meeting of the DRIs chaired by the Agile Program Manager; however, the frequency can be tailored to meet the program's needs. A pattern I have seen used to minimize overhead is varying the frequency of the Scrum of Scrums based on project risk.

Another set of activities that generally falls to the Agile Program Manager is the development and communication of program status information. Chairing high-level status meetings, such as those with sponsors or other guidance groups, is a natural extension of the role. However, this requires the Agile Program Manager to act as a conduit of information, transferring knowledge from the Scrum of Scrums to the sponsors and back again. Any problem with information flow can lead to bad decisions and will affect the program.

It is important to recognize that Agile Program Management is more than a specialization within the realm of project management or a side job a Scrum Master can do in his or her spare time. Agile Program Managers need to be well versed both in Agile techniques and in standard program management techniques because the role is a hybrid of both camps. Agile Program Managers build the big-picture view that a portfolio of all of the related projects delivers. They also must facilitate communication via the Scrum of Scrums and standard program status vehicles. The Agile Program Manager must often straddle the line between the Agile and waterfall worlds.


Categories: Process Management

R: Calculating the difference between ordered factor variables

Mark Needham - Thu, 07/02/2015 - 23:55

In my continued exploration of Wimbledon data I wanted to work out whether a player had done as well as their seeding suggested they should.

I therefore wanted to work out the difference between the round they reached and the round they were expected to reach. A 'round' in the dataset is an ordered factor variable.

These are all the possible values:

rounds = c("Did not enter", "Round of 128", "Round of 64", "Round of 32", "Round of 16", "Quarter-Finals", "Semi-Finals", "Finals", "Winner")

And if we want to factorise a couple of strings into this factor we would do it like this:

round = factor("Finals", levels = rounds, ordered = TRUE)
expected = factor("Winner", levels = rounds, ordered = TRUE)  
 
> round
[1] Finals
9 Levels: Did not enter < Round of 128 < Round of 64 < Round of 32 < Round of 16 < Quarter-Finals < ... < Winner
 
> expected
[1] Winner
9 Levels: Did not enter < Round of 128 < Round of 64 < Round of 32 < Round of 16 < Quarter-Finals < ... < Winner

In this case the difference between the actual round and expected round should be -1 – the player was expected to win the tournament but lost in the final. We can calculate that difference by calling the unclass function on each variable:

 
> unclass(round) - unclass(expected)
[1] -1
attr(,"levels")
[1] "Did not enter"  "Round of 128"   "Round of 64"    "Round of 32"    "Round of 16"    "Quarter-Finals"
[7] "Semi-Finals"    "Finals"         "Winner"

That still seems to have some remnants of the factor variable so to get rid of that we can cast it to a numeric value:

> as.numeric(unclass(round) - unclass(expected))
[1] -1

And that’s it! We can now go and apply this calculation to all seeds to see how they got on.
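For illustration, here is a minimal sketch of that last step (the player names and the expected column are made up; only the rounds vector and the unclass trick come from above):

library(dplyr)
 
players = data.frame(name = c("player1", "player2"),
                     round = factor(c("Winner", "Finals"), levels = rounds, ordered = TRUE),
                     expected = factor(c("Winner", "Winner"), levels = rounds, ordered = TRUE))
 
players %>% mutate(difference = as.numeric(unclass(round)) - as.numeric(unclass(expected)))
# player2 gets -1: they reached the final but were expected to win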

Categories: Programming

Game Performance: Data-Oriented Programming

Android Developers Blog - Thu, 07/02/2015 - 22:12

Posted by Shanee Nishry, Game Developer Advocate

To improve game performance, we’d like to highlight a programming paradigm that will help you maximize your CPU potential, make your game more efficient, and code smarter.

Before we get into detail of data-oriented programming, let’s explain the problems it solves and common pitfalls for programmers.

Memory

The first thing a programmer must understand is that memory is slow, and the way you code affects how efficiently it is utilized. Inefficient memory layout and order of operations forces the CPU to sit idle, waiting for memory before it can proceed with its work.

The easiest way to demonstrate is by using an example. Take this simple code for instance:

char data[1000000]; // One Million bytes
unsigned int sum = 0;

for ( int i = 0; i < 1000000; ++i )
{
  sum += data[ i ];
}

An array of one million bytes is declared and iterated on one byte at a time. Now let's change things a little to illustrate the underlying hardware. The changes are the array size and the loop stride:

char data[16000000]; // Sixteen Million bytes
unsigned int sum = 0;

for ( int i = 0; i < 16000000; i += 16 )
{
  sum += data[ i ];
}

The array is changed to contain sixteen million bytes and we iterate over one million of them, skipping 16 at a time.

A quick look suggests there shouldn't be any effect on performance, as the code is translated to the same number of instructions and runs the same number of times; however, that is not the case. Here is the difference graph. Note that this is on a logarithmic scale; if the scale were linear, the performance difference would be too large to display on any reasonably sized graph!


Graph in logarithmic scale

The simple change making the loop skip 16 bytes at a time makes the program run 5 times slower!

The average difference in performance is 5x and is consistent whether iterating over 1,000 bytes or up to a million bytes, sometimes increasing up to 7x. This is a serious change in performance.

Note: The benchmark was run on multiple hardware configurations including a desktop with Intel 5930K 3.50GHz CPU, a Macbook Pro Retina laptop with 2.6 GHz Intel i7 CPU and Android Nexus 5 and Nexus 6 devices. The results were pretty consistent.

If you wish to replicate the test, you might have to ensure the memory is out of the cache before running the loop because some compilers will cache the array on declaration. Read below to understand more on how it works.
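For those who want to try something similar, here is a minimal, unofficial timing harness (not the benchmark used for the numbers above); compile with optimizations and expect the results to vary by hardware and compiler:

#include <chrono>
#include <iostream>
#include <vector>

int main()
{
    // Sixteen million bytes, heap-allocated and initialised so the reads are real.
    std::vector<char> data(16000000, 1);

    // Time one million reads of the array at the given stride.
    auto time_loop = [&](int stride)
    {
        unsigned int sum = 0;
        auto start = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 1000000; ++i)
        {
            sum += data[i * stride];
        }
        auto end = std::chrono::high_resolution_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
        std::cout << "stride " << stride << ": sum = " << sum
                  << ", " << us << " microseconds\n";
    };

    // Run the strided version first; for a stricter benchmark you would also
    // flush or overwrite the cache between the two runs.
    time_loop(16);
    time_loop(1);
    return 0;
}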

Explanation

What happens in the example is quite simply explained when you understand how the CPU accesses data. The CPU can't operate on data in RAM directly; the data must be copied into the cache, a smaller but extremely fast memory that resides near the CPU chip.

When the program starts, the CPU is set to run an instruction on part of the array but that data is still not in the cache, therefore causing a cache miss and forcing the CPU to wait for the data to be copied into the cache.

For simplicity's sake, assume a cache line size of 16 bytes for the L1 cache; this means 16 bytes will be copied starting from the address requested by the instruction.

In the first code example, the program next tries to operate on the following byte, which was already copied into the cache by the initial cache miss, so it continues smoothly. This is also true for the next 14 bytes. Sixteen bytes after the first cache miss, the loop will encounter another cache miss and the CPU will again wait for data to operate on while the next 16 bytes are copied into the cache.

In the second code sample, the loop skips 16 bytes at a time but the hardware continues to operate in the same way. The cache copies the 16 subsequent bytes each time it encounters a cache miss, which means the loop will trigger a cache miss with each iteration and cause the CPU to wait idle for data each time!

Note: Modern hardware implements cache prefetch algorithms to avoid incurring a cache miss on every iteration, but even with prefetching, more bandwidth is used and performance is lower in our example test.

In reality, cache lines tend to be larger than 16 bytes, and the program would run much slower if it had to wait for data at every iteration. The Krait 400 found in the Nexus 5 has an L0 data cache of 4 KB with 64 bytes per line.

If you are wondering why cache lines are so small, the main reason is that making fast memory is expensive.

Data-Oriented Design

The way to solve such performance issues is by designing your data to fit into the cache and having the program operate on the entire data set contiguously.

This can be done by organizing your game objects inside Structures of Arrays (SoA) instead of Arrays of Structures (AoS) and pre-allocating enough memory to contain the expected data.

For example, a simple physics object in an AoS layout might look like this:

struct PhysicsObject
{
  Vec3 mPosition;
  Vec3 mVelocity;

  float mMass;
  float mDrag;
  Vec3 mCenterOfMass;

  Vec3 mRotation;
  Vec3 mAngularVelocity;

  float mAngularDrag;
};

This is a common way to represent an object in C++.

On the other hand, using SoA layout looks more like this:

class PhysicsSystem
{
private:
  size_t mNumObjects;
  std::vector< Vec3 > mPositions;
  std::vector< Vec3 > mVelocities;
  std::vector< float > mMasses;
  std::vector< float > mDrags;

  // ...
};

Let’s compare how a simple function to update object positions by their velocity would operate.

For the AoS layout, a function would look like this:

void UpdatePositions( PhysicsObject* objects, const size_t num_objects, const float delta_time )
{
  for ( int i = 0; i < num_objects; ++i )
  {
    objects[i].mPosition += objects[i].mVelocity * delta_time;
  }
}

The PhysicsObject is loaded into the cache but only the first two variables are used. At 12 bytes each, they amount to 24 bytes of the cache line being utilised per iteration, and because each object is larger than the 64-byte cache line of a Nexus 5, every object causes a cache miss.

Now let’s look at the SoA way. This is our iteration code:

void PhysicsSystem::SimulateObjects( const float delta_time )
{
  for ( int i = 0; i < mNumObjects; ++i )
  {
    mPositions[ i ] += mVelocities[i] * delta_time;
  }
}

With this code, we immediately cause two cache misses (one for the positions array and one for the velocities array), but we are then able to run smoothly for about 5.3 iterations (a 64-byte cache line holds five and a third 12-byte Vec3 values) before causing the next two cache misses, resulting in a significant performance increase!

The way data is sent to the hardware matters. Be aware of data-oriented design and look for places it will perform better than object-oriented code.

We have barely scratched the surface. There is still more to data-oriented programming than structuring your objects. For example, the cache is used for storing instructions and function memory so optimizing your functions and local variables affects cache misses and hits. We also did not mention the L2 cache and how data-oriented design makes your application easier to multithread.

Make sure to profile your code to find out where you might want to implement data-oriented design. You can use different profilers for different architectures, including the NVIDIA Tegra System Profiler, the ARM Streamline Performance Analyzer, Intel and PowerVR PVRMonitor.

If you want to learn more about optimizing for your cache, read up on cache prefetching for various CPU architectures.

Categories: Programming

Agency Product Owner Training Starts in August

We have an interesting problem in some projects. Agencies, consulting organizations, and consultants help their clients understand what the client needs in a product. Often, these people and their organizations then implement what the client and agency develop as ideas.

As the project continues, the agency manager continues to help the client identify and update the requirements. Because this is a limited-time contract, the client doesn't have a product manager or product owner. The agency person—often the owner—acts as a product owner.

This is why Marcus Blankenship and I have teamed up to offer Product Owner Training for Agencies.

If you are an agency/consultant/outside your client’s organization and you act as a product owner, this training is for you. It’s based on my workshop Agile and Lean Product Ownership. We won’t do everything in that workshop. Because it’s an online workshop, you’ll work on your projects/programs in between our meetings.

If you are not part of an organization and you find yourself acting as a product owner, this training is for you. See Product Owner Training for Agencies.

Categories: Project Management

Do You Really Bill $300 an Hour?

Making the Complex Simple - John Sonmez - Thu, 07/02/2015 - 15:00

In this episode, I explain why I bill $300 an hour. Full transcript: John: Hey, this is John Sonmez from simpleprogrammer.com. I got a question today about my billing rate, so whether it's real or not and I found a few questions like this. I just want to say upfront that when I say something […]

The post Do You Really Bill $300 an Hour? appeared first on Simple Programmer.

Categories: Programming

Flappy

Phil Trelford's Array - Thu, 07/02/2015 - 08:04

This week I ran a half-day hands on games development session at the Progressive .Net Tutorials hosted by Skills Matter in London. I believe this was the last conference to be held in Goswell Road before the big move to an exciting new venue.

My session was on mobile games development with F# as the implementation language:

Ready, steady, cross platform games - ProgNet 2015 from Phillip Trelford

Here’s a quick peek inside the room:

Game programming with @ptrelford @ #prognet2015 I'ma make and sell a BILLION copies of this game y'all! pic.twitter.com/5qa4wOSY1G

— Adron (@adron) July 1, 2015

The session tasks were around 2 themes:

  • implement a times table question and answer game (think Nintendo's Brain Training game)
  • extend a simple Flappy Bird clone

Times table game

The motivation behind this example was to help people:

  • build a simple game loop
  • pick up some basic F# skills

The first tasks, like asking a multiplication question, could be built using F#'s REPL (F# Interactive), and later tasks that took user input required running as a console application.

Here are some of the great solutions that were posted up to F# Snippets.

To run them, create a new F# Console Application project in Xamarin Studio or Visual Studio and paste in the code (use the Raw view in F# Snippets to copy the code).

Dominic Finn’s source code includes some fun ASCII art too:

// _____ _   _ _____ _____ _____  ______  _  _   _____  _____ _     
//|  __ \ | | |  ___/  ___/  ___| |  ___|| || |_|  _  ||  _  | |    
//| |  \/ | | | |__ \ `--.\ `--.  | |_ |_  __  _| | | || | | | |    
//| | __| | | |  __| `--. \`--. \ |  _| _| || |_| | | || | | | |    
//| |_\ \ |_| | |___/\__/ /\__/ / | |  |_  __  _\ \_/ /\ \_/ / |____
// \____/\___/\____/\____/\____/  \_|    |_||_|  \___/  \___/\_____/
//  

Flappy Bird clone

For this example I sketched out a flappy bird clone using Monogame (along with WinForms and WPF for comparison) with the idea that people could enhance and extend the game:

image

Monogame lets you target multiple platforms including iOS and Android, along with Mac, Linux, Windows and even Raspberry Pi!

The different flavours are available on F# Snippets; simply cut and paste them into an F# script file to run them.

All the samples and tasks are also available in a zip: http://trelford.com/ProgNet15.zip

Have fun!

Categories: Programming

SE-Radio Episode 231: Joshua Suereth and Matthew Farwell on SBT and Software Builds

Joshua Suereth and Matthew Farwell discuss SBT (Simple Build Tool) and their new book SBT in Action. They first look at the factors creating a need for build systems and why they think SBT, a new addition to this area, is a valuable contribution in spite of the vast number of existing build tools. Host Tobias Kaatz, […]
Categories: Programming

What Happened to our Basic Math Skills?

Herding Cats - Glen Alleman - Wed, 07/01/2015 - 15:29

Making decisions in the presence of uncertainty about the future outcomes resulting from those decisions is an important topic in the project management, product development, and engineering domains. The first question in this domain is...

If the future is not identical to the past, how can we make a decision in the presence of this future uncertainty?

The answer is we need some means of taking what we know about the past and the present and turning it into information about the future. This information can be measurements of actual activities - cost, duration of work, risks, dependencies, performance and effectiveness measures, models and simulation of past and future activities, reference classes, parametric models.

If the future is identical to the past and the present, then all this data can show us a simple straight line projection from the past to the future.

But there are some questions:

  • Is the future like the past? Have we just assumed this? Or have we actually developed an understanding of the future by looking into what could possibly change from the past?
  • If there is no change, can that future be sustained long enough for our actions to have a beneficial impact?
  • If we discover the future may not be like the past, what is the statistical behavior of this future, how can we discover this behavior, and how will these changes impact our decision making processes?

The answers to these and many other questions can be found in the mathematics of probability and statistics. Here are some popular misconceptions about these mathematical concepts.

Modeling is the Key to Decision Making

"All models are wrong, some are useful," George Box and Norman R. Draper (1987). Empirical Model-Building and Response Surfaces, p. 424, Wiley. ISBN 0471810339. 

  • This book is about process control systems and the statistical process models used to design and operate the control systems in chemical plants. (This is a domain I have worked in and developed software for).
  • This quote has been wildly misquoted, not only out of context, but also completely out of the domain it is applicable to.
  • All models are wrong says that every model is wrong because it is a simplification of reality. That is the definition of a model.
  • Some models, in the "hard" sciences, are only a little wrong. They ignore things like friction or the gravitational effect of tiny bodies. Other models are a lot wrong - they ignore bigger things. In the social sciences, big things are ignored.
  • Statistical models are descriptions of systems using mathematical language. In many cases we can add a certain layer of abstraction to enable an inferential procedure.
  • It is almost impossible for a single model to describe perfectly a real world phenomenon, given our own subjective view of the world, since our sensory system is not perfect.
  • But - and this is the critical misinterpretation of Box's quote - successful statistical inference does happen, because there is a certain degree of consistency in the world that we can exploit.
  • So our almost always wrong models do prove useful.

We can't possibly estimate activities in the future if we don't already know what they are

We actually do this all the time. More importantly, there are simple step-by-step methods for making credible estimates about unknown - BUT KNOWABLE - outcomes.
This notion of unknown but knowable is critical. If we really can't know - if it is unknowable - then the work is not a project. It is pure research. So move on, unless you're a PhD researcher.

Here's a little dialog showing how to estimate most anything in the software development world.
With your knowledge and experience in the domain and a reasonable understanding of what the customer wants (no units of measure for "reasonable," by the way, sorry), let's ask some questions.

I have no pre-defined expectation of the duration. That is I have no anchor to start. If I did and didn't have a credible estimate I'd be a Dilbert manager - and I'm not.

  • Me - now that you know a little bit about my needed feature, can you develop this in less than 6 months?
  • You - of course I can, I'm not a complete moron.
  • Me - good, I knew I was right to hire you. How about developing this feature in a week?
  • You - are you out of your mind? I'd have to be a complete moron to sign up for that.
  • Me - good, still confirms I hired the right person for the job. How about getting it done in 4 months?
  • You - well, that still seems like too long, but I guess it'll be more than enough time if we run into problems or it turns out you don't really know what you want and change your mind.
  • Me - thanks for the confidence in my ability. How about 6 weeks for this puppy?
  • You - aw come on, now you're making me cranky. I don't know anyone, except someone who has done this already, that can do it in 6 weeks. That's a real stretch for me. A real risk of failure and I don't want that. You hired me to be successful, and now you're setting me up for failure.
  • Me - good, just checking. How about 2½ months - about 10 weeks?
  • You - yea, that sounds pretty easy, with some margin. I'll go for that.
  • Me - nice, I like the way you think. How about 7 weeks?
  • You - boy, you're a pushy one aren't you. That's a stretch, but I've got some sense of what you want. It's possible, but I can't really commit to being done in that time; it'll be risky but I'll try.
  • Me - good, let's go with 8½ weeks for now, and we'll update the estimate after a few weeks of you actually producing output I can look at.

Microeconomics of Decision Making

 Making decisions about the future in the presence of uncertainty can be addressed by microeconomics principles. Microeconomics is a branch of economics that studies the behavior of individuals and small impacting organizations in making decisions on the allocation of limited resources. Projects have limited resources, business has limited resources. All human endeavors have limited resources - time, money, talent, capacity for work, skills, and other unknowns. 

The microeconomics of decision making involves several variables

  • Opportunity cost - the value of what we give up by taking that action. If we decide between A and B and choose B, what is the cost of A that we're giving up?
  • Marginal cost analysis - the impact of small changes in the "how-much" decision.
  • Sunk cost - costs that have already been incurred and cannot be recovered.
  • Present Value - the value today of a future cost or benefit (see the formula below).
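As a small worked illustration of that last item (this is the standard discounting formula, not something taken from the post), the present value of a cost or benefit C_t occurring t periods from now at discount rate r, and the net present value of a stream of benefits B_t and costs C_t, are:

PV = \frac{C_t}{(1 + r)^t} \qquad\qquad NPV = \sum_{t=0}^{T} \frac{B_t - C_t}{(1 + r)^t}

For example, a $10,000 benefit received two years from now at a 10% discount rate is worth about 10,000 / 1.1^2, roughly $8,264, today.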

Formally, defining this choice problem is simple: there is a state space S, whose elements are called states of nature and represent all the possible realizations of uncertainty; there is an outcome space X, whose elements represent the possible results of any conceivable decision; and there is a preference relation ⪰ over the mappings from S to X. †

This, of course, provides little in the way of guidance for making a decision on a project. But the point here is that making decisions in the presence of uncertainty is a well developed discipline. Conjecturing it can't be done simply ignores this discipline.

The Valuation of Project Deliverables

It's been conjectured that focusing on value is the basis of good software development efforts. When it is suggested that this value is independent of cost, that suggestion is misinformed. Valuation, and the resulting Value used to compare choices, is the process of determining the economic value of an asset, be it a created product, a service, or a process. Value is defined as the net worth, or the difference between the benefits produced by the asset and the costs to develop or acquire the asset, all adjusted appropriately for probabilistic risk, at some point in time.

This valuation has several difficulties:

  • Costs and benefits might occur at different points in time and need to be adjusted, or discounted, to account for the time value of money: the fundamental principle that money is worth more today than in the future under ordinary economic conditions.
  • Not all determinants of value are known at the time of the valuation, since there is uncertainty inherent in all project and business environments.
  • Intangible benefits like learning, growth or emergent opportunities, and embedded flexibility are the primary sources of value in the presence of uncertainty.

The valuation of the outcomes of software projects depends on the analysis of these underlying costs and benefits. A prerequisite for cost-benefit analysis is the identification of the relevant value and cost drivers that produce that value. Both cost and value are probabilistic, driven by uncertainty - both reducible and irreducible.

Modeling Uncertainty

In addition to measurable benefits and costs of the software project, the valuation process must consider uncertainty. Uncertainty arises from different sources. Natural uncertainty (aleatory) is irreducible. This uncertainty relates to variations in the environment variables. Dealing with irreducible uncertainty requires margin for cost, schedule, and the performance of the outcomes, for both value and cost.

Event-based uncertainty (epistemic) is reducible. That is, we can buy down this uncertainty with our actions. We can pay money to find things out. We can pay money to improve the value delivered from the cost we invest to produce that value.

Parameter uncertainty relates to the estimation of parameters (e.g., the reliability of the average number of defects). Model uncertainty relates to the validity of specific models used (e.g., the suitability of a certain distribution to model the defects). There is a straightforward taxonomy of uncertainty for software engineering that includes additional sources such as scope error and assumption error. The standard approach to handling uncertainty is to define probability distributions for the underlying quantities, allowing the application of a standard calculus. Other approaches, based on fuzzy measures or Bayesian networks, consider different types of prior knowledge. ‡

The Final Point Once Again

The conjecture that we can make informed decisions about choices in an uncertain future in the absence of estimates of the impacts of those choices has no basis in the mathematics of decision making.

This conjecture is simply not true. Any attempt to show this can be done has yet to materialize in any testable manner. This is where the basic math skills come into play. There is no math that supports this conjecture. Therefore there is no way to test this conjecture. It's personal opinion uninformed by any mathematics.

Proceed with caution when you hear this.

† Decision Theory Under Uncertainty, Johanna Etner, Meglena Jeleva, Jean-Marc Tallon,  Centre d’Economie de la Sorbonne 2009.64

‡ "Estimates, Uncertainty and Risk," IEEE Software, 69-74 (May 1997), Kitchenham and Linkman; and "Belief Functions in Business Decisions," Studies in Fuzziness and Soft Computing, Vol. 88, Srivastava and Mock

Related articles: Information Technology Estimating Quality; Everything I Learned About PM Came From an Elementary School Teacher; Carl Sagan's BS Detector; Eyes Wide Shut - A View of No Estimates
Categories: Project Management

Software Architecture for Developers in Chinese

Coding the Architecture - Simon Brown - Wed, 07/01/2015 - 11:14

Although it's been on sale in China for a few months, my copies of the Chinese translation of my Software Architecture for Developers book have arrived. :-)

Software Architecture for Developers

I can't read it, but seeing my C4 diagrams in Chinese is fun! Stay tuned for more translations.

Categories: Architecture

Agile Teams Making Decisions (Refined)

Team members working together (or not).

In Agile projects, the roles of the classic project manager in Scrum are spread across the three basic roles (Product Owner, Scrum Master and Development Team). A fourth role, the Agile Program Manager (known as a Release Train Engineer in SAFe), is needed when multiple projects are joined together to become a coordinated program.  The primary measures of success in Agile projects are delivered business value and customer satisfaction.  These attributes subsume the classic topics of on-time, on-budget and on-scope. (Note: Delivered value and customer satisfaction should be the primary measure of success in ALL types of projects, however these are not generally how project teams are held accountable.)

As teams learn to embrace and use Agile principles, they need to learn how to make decisions as a team. The decisions that teams need to learn how to make for themselves always have consequences, and sometimes those consequences will be negative. To accomplish this learning process in the least risky manner, the team should use techniques like delaying decisions as late as is practical and delivering completed work within short time boxes. These techniques reduce risk by increasing the time the team has to gather knowledge and by getting the team feedback quickly. The organization also must learn how to encourage the team to make good decisions while giving them the latitude to mess up. This requires the organization to accept some level of explicit initial ambiguity that is caused by delaying decisions, rather than implicit ambiguity of making decisions early that later turn out to be wrong. The organization must also learn to evaluate teams and individuals less on the outcome of a single decision and more on the outcome of the value the team delivered.

Teams also have to unlearn habits, for example, relying on others to plan for them. In order to do that, all leaders and teams must have an understanding of the true goals of the project (listen to my interview with David Marquet) and how the project fits into the strategic goals of the organization.

Teams make decisions daily that affect the direction of the sprint and project. The faster these decisions are made, the higher the team's velocity or productivity. Having a solid understanding of the real goals of the project helps the team make decisions more effectively. Organizations need to learn how to share knowledge that today is generally compartmentalized among developers, testers and analysts.

The process of learning and unlearning occurs on a continuum as teams push toward a type of collective self-actualization. As any team moves toward its full potential, the organization's need to control planning and decisions falls away. If the organization doesn't back away from the tenets of command and control and move toward the Agile principles, the ability of any team to continue to grow will be constrained. The tipping point generally occurs when an organization realizes that self-managing and self-organizing teams deliver superior value and higher customer satisfaction, and that in the long run is what keeps CIOs employed.


Categories: Process Management

R: write.csv - unimplemented type 'list' in 'EncodeElement'

Mark Needham - Tue, 06/30/2015 - 23:26

Every now and then I want to serialise an R data frame to a CSV file so I can easily load it up again if my R environment crashes, without having to recalculate everything, but recently I ran into the following error:

> write.csv(foo, "/tmp/foo.csv", row.names = FALSE)
Error in .External2(C_writetable, x, file, nrow(x), p, rnames, sep, eol,  : 
  unimplemented type 'list' in 'EncodeElement'

If we take a closer look at the data frame in question it looks ok:

> foo
  col1 col2
1    1    a
2    2    b
3    3    c

However, one of the columns contains a list in each cell and we need to find out which one it is. I’ve found the quickest way is to run the typeof function over each column:

> typeof(foo$col1)
[1] "double"
 
> typeof(foo$col2)
[1] "list"

So 'col2' is the problem one, which isn't surprising if you consider the way I created 'foo':

library(dplyr)
foo = data.frame(col1 = c(1,2,3)) %>% mutate(col2 = list("a", "b", "c"))

If we do have a list that we want to add to the data frame we need to convert it to a vector first so we don’t run into this type of problem:

foo = data.frame(col1 = c(1,2,3)) %>% mutate(col2 = list("a", "b", "c") %>% unlist())

And now we can write to the CSV file:

write.csv(foo, "/tmp/foo.csv", row.names = FALSE)
$ cat /tmp/foo.csv
"col1","col2"
1,"a"
2,"b"
3,"c"

And that’s it!

Categories: Programming

GTAC 2015: Call for Proposals & Attendance

Google Testing Blog - Tue, 06/30/2015 - 22:11
Posted by Anthony Vallone on behalf of the GTAC Committee

The GTAC (Google Test Automation Conference) 2015 application process is now open for presentation proposals and attendance. GTAC will be held at the Google Cambridge office (near Boston, Massachusetts, USA) on November 10th - 11th, 2015.

GTAC will be streamed live on YouTube again this year, so even if you can’t attend in person, you’ll be able to watch the conference remotely. We will post the live stream information as we get closer to the event, and recordings will be posted afterward.

Speakers
Presentations are targeted at students, academics, and experienced engineers working on test automation. Full presentations are 30 minutes and lightning talks are 10 minutes. Speakers should be prepared for a question and answer session following their presentation.

Application
For presentation proposals and/or attendance, complete this form. We will be selecting about 25 talks and 200 attendees for the event. The selection process is not first come, first served (no need to rush your application), and we select a diverse group of engineers from various locations, company sizes, and technical backgrounds (academic, industry expert, junior engineer, etc.).

Deadline
The due date for both presentation and attendance applications is August 10th, 2015.

Fees
There are no registration fees, but speakers and attendees must arrange and pay for their own travel and accommodations.

More information
You can find more details at developers.google.com/gtac.

Categories: Testing & QA

Debian Size Claims - New Lecture Posted

10x Software Development - Steve McConnell - Tue, 06/30/2015 - 19:17

In this week's lecture (https://cxlearn.com) I demonstrate how to use some of the size information we've discussed in other lectures by diving into the Wikipedia claims about the sizes of various versions of Debian.  The point of this week's lecture is to show how to apply critical thinking to size information presented by an authoritative source (Wikipedia), and how to arrive at a confident conclusion that that information is not credible. Practicing software professionals should be able to look at size claims like the Debian size claims and, based on general knowledge, immediately think, "That seems far from credible." Yet, few professionals actually do that. My hope is that working through public examples like this in the lecture series will help software professionals improve their instincts and judgment, which can then be applied to projects in their own organizations. 

Lectures posted so far include:  

0.0 Understanding Software Projects - Intro
     0.1 Introduction - My Background
     0.2 Reading the News
     0.3 Definitions and Notations 

1.0 The Software Lifecycle Model - Intro
     1.1 Variations in Iteration 
     1.2 Lifecycle Model - Defect Removal
     1.3 Lifecycle Model Applied to Common Methodologies
     1.4 Lifecycle Model - Selecting an Iteration Approach  

2.0 Software Size
     2.05 Size - Comments on Lines of Code
     2.1 Size - Staff Sizes 
     2.2 Size - Schedule Basics 
     2.3 Size - Debian Size Claims (New)

Check out the lectures at http://cxlearn.com!

Understanding Software Projects - Steve McConnell

 

Succeeding with Geographically Distributed Scrum Teams - New White Paper

10x Software Development - Steve McConnell - Tue, 06/30/2015 - 19:02

We have a new white paper, "Succeeding with Geographically Distributed Scrum Teams." To quote the white paper itself: 

When organizations adopt Agile throughout the enterprise, they typically apply it to both large and small projects. The gap is that most Agile methodologies, such as Scrum and XP, are team-level workflow approaches. These approaches can be highly effective at the team level, but they do not address large project architecture, project management, requirements, and project planning needs. Our clients find that succeeding with Scrum on a large, geographically distributed team requires adopting additional practices to ensure the necessary coordination, communication, integration, and architectural work. This white paper discusses common considerations for success with geographically distributed Scrum.

Check it out!

What’s new with Google Fit: Distance, Calories, Meal data, and new apps and wearables

Google Code Blog - Tue, 06/30/2015 - 18:52

Posted by Angana Ghosh, Lead Product Manager, Google Fit

To help users keep track of their physical activity, we recently updated the Google Fit app with some new features, including an Android Wear watch face that helps users track their progress throughout the day. We also added data types to the Google Fit SDK and have new partners tracking data (e.g. nutrition, sleep, etc.) that developers can now use in their own apps. Find out how to integrate Google Fit into your app and read on to check out some of the cool new stuff you can do.


Distance traveled per day

The Google Fit app now computes the distance traveled per day. Subscribe to it using the Recording API and query it using the History API.
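As a rough sketch only (this code is not from the post; it assumes the play-services-fitness dependency, an already-connected GoogleApiClient named client, and execution off the main thread), subscribing to and then querying daily distance with the 2015-era APIs looked roughly like this:

import com.google.android.gms.common.api.GoogleApiClient;
import com.google.android.gms.common.api.Status;
import com.google.android.gms.fitness.Fitness;
import com.google.android.gms.fitness.data.DataType;
import com.google.android.gms.fitness.request.DataReadRequest;
import com.google.android.gms.fitness.result.DataReadResult;
import java.util.concurrent.TimeUnit;

public class DailyDistanceExample {

    // Ask Google Fit to record distance in the background (Recording API).
    public static boolean subscribeToDistance(GoogleApiClient client) {
        Status status = Fitness.RecordingApi
                .subscribe(client, DataType.TYPE_DISTANCE_DELTA)
                .await(30, TimeUnit.SECONDS);
        return status.isSuccess();
    }

    // Read the recorded distance between startMillis and endMillis, bucketed per day (History API).
    public static DataReadResult readDailyDistance(GoogleApiClient client,
                                                   long startMillis, long endMillis) {
        DataReadRequest request = new DataReadRequest.Builder()
                .aggregate(DataType.TYPE_DISTANCE_DELTA, DataType.AGGREGATE_DISTANCE_DELTA)
                .bucketByTime(1, TimeUnit.DAYS)
                .setTimeRange(startMillis, endMillis, TimeUnit.MILLISECONDS)
                .build();
        return Fitness.HistoryApi.readData(client, request).await(1, TimeUnit.MINUTES);
    }
}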

Calories burned per day

If a user has entered their details into the Google Fit app, the app now computes their calories burned per day. Subscribe to it using the Recording API and query it using the History API.

Nutrition data from LifeSum, Lose It!, and MyFitnessPal

LifeSum and Lose It! are now writing nutrition data, like calories consumed, macronutrients (proteins, carbs, fats), and micronutrients (vitamins and minerals) to Google Fit. MyFitnessPal will start writing this data soon too. Query it from Google Fit using the History API.

Sleep activity from Basis Peak and Sleep as Android

Basis Peak and Sleep as Android are now writing sleep activity segments to Google Fit. Query this data using the History API.

New workout sessions and activity data from even more great apps and fitness wearables!

Endomondo, Garmin, the Daily Burn, the Basis Peak and the Xiaomi miBand are new Google Fit partners that will allow users to store their workout sessions and activity data. Developers can access this data with permission from the user, which will also be shown in the Google Fit app.

How are developers using the Google Fit platform?

Partners like LifeSum and Lose It! are reading all-day activity to help users keep track of their physical activity in their favorite fitness app.

Runkeeper now shows a Google Now card to its users encouraging them to "work off" their meals, based on their meals written to Google Fit by other apps.

Instaweather has integrated Google Fit into a new Android Wear face that they’re testing in beta. To try out the face, first join this Google+ community and then follow the link to join the beta and download the app.

We hope you enjoy checking out these Google Fit updates. Thanks to all our partners for making it possible! Find out more about integrating the Google Fit SDK into your app.

Categories: Programming

Prioritize and Optimize Over a Slightly Longer Horizon

Mike Cohn's Blog - Tue, 06/30/2015 - 15:00

A lot of agile literature stresses that product owners must prioritize the delivery of value. I’m not going to argue with that. But I am going to argue that product owners need to optimize over a slightly longer horizon than a single sprint.

A product owner's goal is to maximize the amount of value delivered over the life of a product. If the product owner shows up at each sprint planning meeting focused on maximizing the value delivered in only that one sprint, the product owner will never choose to invest in the system's future.

The product owner will instead always choose to deliver the feature that delivers the most value in the immediate future regardless of the long-term benefit. Such a product owner is like the pleasure-seeking student who parties every night during the semester but fails, and has to repeat the course during the summer.

A product owner with a short time horizon may have the team work on features A1, B1, C1 and D1 in four different parts of the application this sprint because those are the highest valued features.

A product owner with a more appropriate, longer view of the product will instead have the team work on A1, A2, A3 and A4 in the first sprint because there are synergies to working on all the features in area A at the same time.

Agile is still about focusing on value. And it’s still about making sure we deliver value in the short term. We just don’t want to become shortsighted about it.

How to create the smallest possible docker container of any image

Xebia Blog - Tue, 06/30/2015 - 10:46

Once you start to do some serious work with Docker, you soon find that downloading images from the registry is a real bottleneck in starting applications. In this blog post we show you how you can reduce the size of any docker image to just a few percent of the original. So if your image is too fat, try stripping your Docker image! The strip-docker-image utility demonstrated in this blog makes your containers faster and safer at the same time!


We are working quite intensively on our High Available Docker Container Platform using CoreOS and Consul, which consists of a number of containers (NGiNX, HAProxy, the Registrator and Consul). These containers run on each of the nodes in our CoreOS cluster, and when the cluster boots, more than 600 MB is downloaded by the 3 nodes in the cluster. This is quite time consuming.

cargonauts/consul-http-router      latest              7b9a6e858751        7 days ago          153 MB
cargonauts/progrium-consul         latest              32253bc8752d        7 weeks ago         60.75 MB
progrium/registrator               latest              6084f839101b        4 months ago        13.75 MB

The size of the images is not only detrimental to the boot time of our platform, it also increases the attack surface of the container. With 153 MB of utilities in the NGiNX-based consul-http-router, there is a lot of stuff in the container that you can use once you get inside. As we were thinking of running this router in a DMZ, we wanted to minimise the amount of tools lying around for a potential hacker.

From our colleague Adriaan de Jonge we already learned how to create the smallest possible Docker container  for a Go program. Could we repeat this by just extracting the NGiNX executable from the official distribution and copying it onto a scratch image?  And it turns out we can!

finding the necessary files

Using the utility dpkg we can list all the files that are installed by NGiNX

docker run nginx dpkg -L nginx
...
/.
/usr
/usr/sbin
/usr/sbin/nginx
/usr/share
/usr/share/doc
/usr/share/doc/nginx
...
/etc/init.d/nginx
locating dependent shared libraries

So we have the list of files in the package, but we do not have the shared libraries that are referenced by the executable. Fortunately, these can be retrieved using the ldd utility.

docker run nginx ldd /usr/sbin/nginx
...
	linux-vdso.so.1 (0x00007fff561d6000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd8f17cf000)
	libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007fd8f1598000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007fd8f1329000)
	libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007fd8f10c9000)
	libcrypto.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007fd8f0cce000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd8f0ab2000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd8f0709000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fd8f19f0000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd8f0505000)
Following and including symbolic links

Now that we have the executable and the referenced shared libraries, it turns out that ldd normally names the symbolic link and not the actual file name of the shared library.

docker run nginx ls -l /lib/x86_64-linux-gnu/libcrypt.so.1
...
lrwxrwxrwx 1 root root 16 Apr 15 00:01 /lib/x86_64-linux-gnu/libcrypt.so.1 -> libcrypt-2.19.so
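One way to resolve such a link to its real target (readlink is available in the Debian-based official nginx image) is:

docker run nginx readlink -f /lib/x86_64-linux-gnu/libcrypt.so.1
...
/lib/x86_64-linux-gnu/libcrypt-2.19.so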

By resolving the symbolic links and including both the link and the file, we are ready to export the bare essentials from the container!

getpwnam does not work

But after copying all essential files to a scratch image, NGiNX did not start. It appeared that NGiNX tries to resolve the user id 'nginx' and fails to do so.

docker run -P  --entrypoint /usr/sbin/nginx stripped-nginx  -g "daemon off;"
...
2015/06/29 21:29:08 [emerg] 1#1: getpwnam("nginx") failed (2: No such file or directory) in /etc/nginx/nginx.conf:2
nginx: [emerg] getpwnam("nginx") failed (2: No such file or directory) in /etc/nginx/nginx.conf:2

It turned out that the shared libraries for the name service switch, which reads /etc/passwd and /etc/group, are loaded at runtime and not referenced in the shared libraries. By adding these shared libraries (/lib/*/libnss*) to the container, NGiNX worked!

strip-docker-image example

So now, the strip-docker-image utility is here for you to use!

    strip-docker-image  -i image-name
                        -t target-image-name
                        [-p package]
                        [-f file]
                        [-x expose-port]
                        [-v]

The options are explained below:

-i image-name           to strip
-t target-image-name    the image name of the stripped image
-p package              package to include from image, multiple -p allowed.
-f file                 file to include from image, multiple -f allowed.
-x port                 to expose.
-v                      verbose.

The following example creates a new nginx image, named stripped-nginx based on the official Docker image:

strip-docker-image -i nginx -t stripped-nginx  \
                           -x 80 \
                           -p nginx  \
                           -f /etc/passwd \
                           -f /etc/group \
                           -f '/lib/*/libnss*' \
                           -f /bin/ls \
                           -f /bin/cat \
                           -f /bin/sh \
                           -f /bin/mkdir \
                           -f /bin/ps \
                           -f /var/run \
                           -f /var/log/nginx \
                           -f /var/cache/nginx

Aside from the nginx package, we add the files /etc/passwd, /etc/group and /lib/*/libnss* shared libraries. The directories /var/run, /var/log/nginx and /var/cache/nginx are required for NGiNX to operate. In addition, we added /bin/sh and a few handy utilities, just to be able to snoop around a little bit.
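Conceptually, the result is the same as building from an empty base image. A hand-written sketch of that idea (the rootfs/ directory name and this Dockerfile are illustrative only; they are not what strip-docker-image generates verbatim) could look like:

FROM scratch
# rootfs/ is assumed to contain the extracted binary, shared libraries, config files and writable directories.
COPY rootfs/ /
EXPOSE 80
ENTRYPOINT ["/usr/sbin/nginx", "-g", "daemon off;"]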

The stripped image has now shrunk to an incredible 5.4% of the original: from 132.8 MB to just 7.3 MB, and it is still fully operational!

docker images | grep nginx
...
stripped-nginx                     latest              d61912afaf16        21 seconds ago      7.297 MB
nginx                              1.9.2               319d2015d149        12 days ago         132.8 MB

And it works!

ID=$(docker run -P -d --entrypoint /usr/sbin/nginx stripped-nginx  -g "daemon off;")
docker run --link $ID:stripped cargonauts/toolbox-networking curl -s -D - http://stripped
...
HTTP/1.1 200 OK

For HAProxy, check out the examples directory.

Conclusion

It is possible to use the official images that are maintained and distributed by Docker and strip them down to their bare essentials, ready for use! It speeds up load times and reduces the attack surface of that specific container.

Check out the GitHub repository for the script and the manual page.

Please send me your examples of incredibly shrunk Docker images!

Making Decisions In The Presence of Uncertainty

Herding Cats - Glen Alleman - Tue, 06/30/2015 - 03:57

Decision making is hard. Decision making is easy when we know what to do. When we don't know what to do there are conflicting choices that must be balanced in the presence of uncertainty for each of those choices. The bigger issue is that important choices are usually ones where we know the least about the outcomes and the cost and schedule to achieve those outcomes.

Decision science evolved to cope with decision making in the presence of uncertainty. This approach goes back to Bernoulli in the early 1700s, but remained an academic subject into the 20th century, because there was no satisfactory way to deal with the complexity of real life. Just after World War II, the fields of systems analysis and operations research began to develop. With the help of computers, it became possible to analyze problems of great complexity in the presence of uncertainty.

In 1938, Chester Barnard, author of The Functions of the Executive, brought the term "decision making" from the lexicon of public administration into the business world. This term replaced narrower descriptions such as "resource allocation" and "policy making."

Decision analysis functions at four different levels

  • Philosophy - uncertainty is a consequence of our incomplete knowledge of the world. In some cases, uncertainty can be partially or completely resolved before decisions are made and resources committed. In many important cases, complete information is not available or is too expensive (in time, money, or other resources) to obtain.
  • Decision framework - decision analysis provides concepts and language to help the decision-maker be aware of the adequacy or inadequacy of the decision basis.
  • Decision-making process - provides a step-by-step procedure that has proved practical in tackling even the most complex problems in an efficient and orderly way.
  • Decision-making methodology - provides a number of specific tools that are sometimes indispensable in analyzing a decision problem. These tools include procedures for eliciting and constructing influence diagrams, probability trees, and decision trees; procedures for encoding probability functions and utility curves; and a methodology for evaluating these trees and obtaining information useful to further refine the analysis.

Each level focuses on different aspects of the problem of making decisions. And it is decision making that we're after.  The purpose of the analysis is not to obtain a set of numbers describing decision alternatives. It is to provide the decision-maker the insight needed to choose between alternatives. These insights typically have three elements:

  • What is important to making the decision?
  • Why is it important?
  • How important is it?

Now To The Problem at Hand

It has been conjectured ...

No Estimates

The key here, and the critical unanswered question, is: how can a decision about an outcome in the future, in the presence of that uncertain future, be made in the absence of estimates of the attributes going into that decision?

That is, if we have less than acceptable knowledge about a future outcome, how can we make a decision about the choices involved in that outcome?

Dealing with Uncertainty

All project work operates in the presence of uncertainty. The underlying statistical processes create probabilistic outcomes for future activities. These activities may be probabilistic events, or the naturally occurring variances of the processes that make up the project. 

Clarity of discussion through the language of probability is one of the bases of decision analysis. The reality of uncertainty must be confronted and described, and the mathematics of probability is the natural language to describe uncertainty.

When we don't have that clarity of language, when redefining or misusing mathematical terms enters the conversation, agreeing on the ways (and there are many ways) of making decisions in the presence of an uncertain future becomes bogged down in approaches that can't be tested in any credible manner. What remains is personal opinion, small-sample anecdotes, and attempts to solve complex problems with simple and simple-minded approaches.

For every complex problem there is an answer that is clear, simple, and wrong. H. L. Mencken

Related articles: Eyes Wide Shut - A View of No Estimates; Carl Sagan's BS Detector; Systems Thinking, System Engineering, and Systems Management
Categories: Project Management