Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


Feed aggregator

Voxxed interview and 20% discount on my Parleys course

Voxxed have just published a short interview with me about software architecture, sketches, agile and my "Software Architecture for Developers" training course on Parleys where I answer the following questions:

  1. You're an independent consultant - have your experiences in this (sometimes challenging) field fed into your course?
  2. Who is your course aimed at? How experienced do people need to be?
  3. Do you think a good grasp of agile methodology is important for this course?
  4. Can you give us an example of the kind of sketch you'd use to visualize your architecture?
  5. What's wrong with many of the software architecture sketches that you see?
  6. Diagrams that don't reflect the code - why is this a problem?
  7. A recent article suggested young developers should avoid the agile manifesto - what's your take on this?

You can read the full interview on Voxxed and, this week, the first 100 people to sign-up to my Parleys course using this link will get a 20% discount.

Software Architecture for Developers

Categories: Architecture

Paper: Immutability Changes Everything by Pat Helland

I was excited to see that Pat Helland has published another thought-provoking paper: Immutability Changes Everything. If video is more your style, Pat gave a wonderful talk on the same subject at RICON2012 (video, slides).

It's fun to see how Pat's thinking is evolving over time as he's worked at Tandem Computers (Transaction Monitoring Facility), Amazon, Microsoft (Microsoft Transaction Server and SQL Service Broker), and now Salesforce.

You might have enjoyed some of Pat's other visionary papers: Life beyond Distributed Transactions: an Apostate's Opinion; The End of an Architectural Era (It's Time for a Complete Rewrite); and Idempotence Is Not a Medical Condition.

This new paper is a high-level overview of why immutability, the idea that destructive updates are not allowed, is a huge architectural win, and why cheaper disk, RAM, and compute now make it financially feasible to keep all the things. The key insight is that without data updates, coordination in a distributed system becomes a much simpler problem to solve.
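To make the idea concrete (this is my own minimal sketch, not something from the paper), an append-only log in Python never overwrites anything; readers of old entries always get the same answer, so there is nothing about past data for a distributed system to coordinate:

import time
from collections import namedtuple

# A hypothetical account-balance history: instead of updating a balance in
# place, every change appends a new immutable version to the log.
Version = namedtuple("Version", ["account", "balance", "timestamp"])

class AppendOnlyLog(object):
    def __init__(self):
        self._entries = []  # entries are only ever appended, never modified

    def append(self, account, balance):
        self._entries.append(Version(account, balance, time.time()))

    def history(self, account):
        # Past versions are facts; re-reading them always gives the same answer.
        return [v for v in self._entries if v.account == account]

    def current(self, account):
        versions = self.history(account)
        return versions[-1] if versions else None

log = AppendOnlyLog()
log.append("alice", 100)
log.append("alice", 75)
print(log.current("alice").balance)  # 75
print(len(log.history("alice")))     # 2 - the old value is still visible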

Immutability is an architectural concept that's been gaining steam on several fronts. Facebook is using a declarative immutable programming model in both the model and the view. We are seeing the idea of immutable infrastructure rise in DevOps. Aeron is a new messaging system that uses a persistent log to good advantage. The Lambda Architecture makes use of immutability. Datomic is a database that treats data as a time-ordered series of immutable objects.

If that's of interest, then you'll like the paper.


Categories: Architecture

Should I Work On Non-Work Things At Work?

I’ve received a lot of questions lately about whether or not it is appropriate to work on non-work things at work. This isn’t an easy question to answer and every situation is a bit different, but I thought I’d offer some general advice that can help you figure out the answer for yourself. Doing something is better than doing nothing ... Read More

The post Should I Work On Non-Work Things At Work? appeared first on Simple Programmer.

Categories: Programming

How a Background in Science Will Prepare Me for a Career in Software Requirements

"What do you plan to do with your degree?" It's a question most everyone hears in college. For some, it's easier to answer than it is for others. If you majored in a concentration that traditionally qualifies you to apply for graduate school, this question can be particularly difficult. I received my degree in Neuroscience […]
Categories: Requirements

Quote of the Month January 2015

From the Editor of Methods & Tools - 13 hours 43 min ago
Principles Trump Diagrams. Most of the problems in using the 1988 spiral model stemmed from users looking at the diagram and constructing processes that had nothing to do with the underlying concepts. This is true of many process models: people just adopt the diagram and neglect the principles that need to be respected. Source: The Incremental Commitment Spiral Model, Barry Boehm, Jo Ann Lane, Supannika Koolmanojwong & Richard Turner, Addison-Wesley

SPaMCAST 326 – Steve Tendon, Tame The Flow

www.spamcast.net


Listen to the Software Process and Measurement Cast

Subscribe to the Software Process and Measurement Cast on iTunes

This Software Process and Measurement Cast features our interview with Steve Tendon. We discussed his new book Tame The Flow: Hyper-Productive Knowledge-Work Performance, The TameFlow Approach and Its Application to Scrum and Kanban, published by J. Ross Publishing. Steve discussed how to lead knowledge workers and build a hyper-performing knowledge work organization. We talked about the four flows that affect performance: psychology, information, work and finance. Steve's ideas can be used to help teams raise their game to deliver results that not only raise the bar but jump over it.

Steve has a great offer for SPaMCAST listeners. Check out https://tameflow.com/spamcast for a way to get Tame The Flow: Hyper-Productive Knowledge-Work Performance, The TameFlow Approach and Its Application to Scrum and Kanban at 40% off the list price.

Steve’s Bio

Steve Tendon, creator of the TameFlow management approach, is a senior, multilingual executive management consultant, experienced at leading and directing multinational and distributed knowledge-work organizations. He is an expert in organizational performance transformation programs. Mr. Tendon is a sought-after adviser, coach, mentor and consultant, as well as author and speaker, specializing in organizational productivity, organizational design, process excellence and process innovation. Steve helps businesses create high-performance organizations and teams and holds an MSc in Software Project Management from the University of Aberdeen.

Mr. Tendon has published numerous articles and is a contributing author to Agility Across Time and Space: Implementing Agile Methods in Global Software Projects. Steve is currently a Director at TameFlow Consulting Ltd, where he helps clients achieve outstanding organizational performance by applying the theories and practices described in this book. Mr. Tendon has held senior Software Engineering Management roles at various firms over the course of his career, including the role of Technical Director for the Italian branch of Borland International, the birthplace of hyper-productivity in software development. Borland's development of Quattro Pro for Windows remains the most productive software project ever documented. This case was Mr. Tendon's source of inspiration that led to his development of the TameFlow perspective and management approach.

Contact Information:

Web: https://tameflow.com/

Web: http://tendon.net/

Twitter: @tendon

 

Next

The next Software Process and Measurement Cast will feature our essay on the ubiquitous stand-up meeting. The stand-up meeting has become a feature of agile and non-agile projects alike. The technique can be a powerful force to improve team effectiveness and cohesion, or it can really make a mess out of things! We explore how to get more of the former and less of the latter!

 

Call to action!

We have just completed a re-read of John Kotter's classic Leading Change on the Software Process and Measurement Blog (www.tcagley.wordpress.com). Please feel free to jump in and add your thoughts and comments!

Next week we will start the process to choose the next book based on the list you have suggested.  You can still influence the possible choices for the next re-read by answering the following question:

What are the two books that have most influenced your career (business, technical or philosophical)? Send the titles to spamcastinfo@gmail.com.

We will publish the list next week on the blog and ask you to vote on the next book for "Re-read" Saturday. Feel free to choose your platform; send an email, leave a message on the blog or Facebook, or just tweet the list (use hashtag #SPaMCAST)!

Shameless Ad for my book!

Mastering Software Project Management: Best Practices, Tools and Techniques was co-authored by Murali Chematuri and myself and published by J. Ross Publishing. We have received unsolicited reviews like the following: "This book will prove that software projects should not be a tedious process, neither for you or your team." Support SPaMCAST by buying the book here.

Available in English and Chinese.


Categories: Process Management


Python: Find the highest value in a group

Mark Needham - Sun, 01/25/2015 - 13:47

In my continued playing around with a How I met your mother data set I needed to find out the last episode that happened in a season so that I could use it in a chart I wanted to plot.

I had this CSV file containing each of the episodes:

$ head -n 10 data/import/episodes.csv
NumberOverall,NumberInSeason,Episode,Season,DateAired,Timestamp
1,1,/wiki/Pilot,1,"September 19, 2005",1127084400
2,2,/wiki/Purple_Giraffe,1,"September 26, 2005",1127689200
3,3,/wiki/Sweet_Taste_of_Liberty,1,"October 3, 2005",1128294000
4,4,/wiki/Return_of_the_Shirt,1,"October 10, 2005",1128898800
5,5,/wiki/Okay_Awesome,1,"October 17, 2005",1129503600
6,6,/wiki/Slutty_Pumpkin,1,"October 24, 2005",1130108400
7,7,/wiki/Matchmaker,1,"November 7, 2005",1131321600
8,8,/wiki/The_Duel,1,"November 14, 2005",1131926400
9,9,/wiki/Belly_Full_of_Turkey,1,"November 21, 2005",1132531200

I started out by parsing the CSV file into a dictionary of (seasons -> episode ids):

import csv
from collections import defaultdict
 
seasons = defaultdict(list)
with open("data/import/episodes.csv", "r") as episodesfile:
    reader = csv.reader(episodesfile, delimiter = ",")
    reader.next()
    for row in reader:
        seasons[int(row[3])].append(int(row[0]))
 
print seasons

which outputs the following:

$ python blog.py
defaultdict(<type 'list'>, {
  1: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], 
  2: [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44], 
  3: [45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64], 
  4: [65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], 
  5: [89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112], 
  6: [113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136], 
  7: [137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160], 
  8: [161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184], 
  9: [185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208]})

It’s reasonably easy to transform that into a dictionary of (season -> max episode id) with the following couple of lines:

for season, episode_ids in seasons.iteritems():
    seasons[season] = max(episode_ids)
 
>>> print seasons
defaultdict(<type 'list'>, {1: 22, 2: 44, 3: 64, 4: 88, 5: 112, 6: 136, 7: 160, 8: 184, 9: 208})

This works fine but it felt very much like a dplyr problem to me so I wanted to see whether I could write something cleaner using pandas.

I started out by capturing the seasons and episode ids in separate lists and then building up a DataFrame:

import pandas as pd
from pandas import DataFrame
 
seasons, episode_ids = [], []
with open("data/import/episodes.csv", "r") as episodesfile:
    reader = csv.reader(episodesfile, delimiter = ",")
    reader.next()
    for row in reader:
        seasons.append(int(row[3]))
        episode_ids.append(int(row[0]))
 
df = DataFrame.from_items([('Season', seasons), ('EpisodeId', episode_ids)])
 
>>> print df.groupby("Season").max()["EpisodeId"]
Season
1          22
2          44
3          64
4          88
5         112
6         136
7         160
8         184
9         208

Or we can simplify that and read the CSV file directly into a DataFrame:

df = pd.read_csv('data/import/episodes.csv', index_col=False, header=0)
 
>>> print df.groupby("Season").max()["NumberOverall"]
Season
1          22
2          44
3          64
4          88
5         112
6         136
7         160
8         184
9         208

Pretty neat. I need to get more into pandas.
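As a side note of my own (not from the post): the snippets above use Python 2 idioms such as reader.next(), iteritems() and print statements. A roughly equivalent Python 3 version of the first approach, plus a column-first form of the groupby, might look like this:

import csv
from collections import defaultdict
import pandas as pd

# Pure-Python version: next(reader) and items() replace the Python 2 idioms
seasons = defaultdict(list)
with open("data/import/episodes.csv") as episodesfile:
    reader = csv.reader(episodesfile)
    next(reader)  # skip the header row
    for row in reader:
        seasons[int(row[3])].append(int(row[0]))

last_episode = {season: max(ids) for season, ids in seasons.items()}
print(last_episode)

# pandas version: selecting the column before aggregating avoids computing
# the max of every other column in the DataFrame
df = pd.read_csv("data/import/episodes.csv", index_col=False, header=0)
print(df.groupby("Season")["NumberOverall"].max())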

Categories: Programming

Re-read Saturday: Part Three: Implications for the Twenty-First Century, John P. Kotter Chapters 11 and 12

index

We complete the re-read of John P. Kotter's book Leading Change by reviewing the implications from the last two chapters of the book. Part Three paints the picture of a world in which the urgency for change will not abate and perhaps might even increase. In Chapter 11, titled The Organization of the Future, Kotter suggests that while in the past a single key leader could drive change, collaboration at the top of organizations is now required due to both the rate and complexity of change. He argues that one person simply can't have the time and expertise to manage, lead, communicate, provide vision . . . you get the point. The message in the chapter is that for organizations of any type to prosper in the 21st century, the ability to create and communicate vision is critical. That skill needs to be fostered and developed over the long term, just like any other significant organizational asset. Long-term and continuous development of leadership is not accomplished simply by providing a two-week course in leadership. While leadership is critical, it only goes so far in creating and fostering change and must be supplemented by a culture of empowerment. Broad-based empowerment allows organizations to tap a wide range of knowledge and energy at all levels of the organization.

Boiling the message of Chapter 11 down, Kotter suggests that an organization at home with the dynamic nature of the 21st century will require a lean, non-bureaucratic structure that leverages a wide range of performance data. For example, in an empowered organization performance data must be gathered and analyzed from many sources. Performance data (e.g. customer satisfaction, productivity, returns, quality and others) gains maximum power when everyone has access to the data in order to drive continuous improvement. The culture of the new organization needs to shift from internally focused, command-and-control to externally focused and non-bureaucratic. While Kotter does not use the terms lean and Agile, the organization he describes as tuned to the 21st century reflects the tenets of lean and agile.

Chapter 12, titled Leadership and Lifelong Learning, circles back to the concept of leadership. It is a constant thread across all facets of the eight-stage model of change detailed in Leading Change. Kotter describes the need for leaders to continually develop competitive capacity (the capability to deal with an increasingly competitive and dynamic environment). The model Kotter uses to describe the development of competitive capacity begins with personal history and flows through competitive drive, lifelong learning, and skills and abilities to competitive capacity. Lifelong learning is an input and a tool for developing and honing skills and abilities. Skills and abilities feed competitive capacity. In our re-read of The Seven Habits of Highly Effective People, Stephen Covey culminated the seven habits with the habit called Sharpening the Saw. Sharpening the Saw is a prescription for balanced self-renewal. Lifelong learning is an important component of balanced self-renewal. Whether you read Kotter or Covey, the need to continuously learn is an inescapable necessity for any leader.

As a rule, I am never overwhelmed by the chapters after the meat of most self-help books (I consider Leading Change a management self-help book, part of a continuum that Covey's Seven Habits of Highly Effective People would be found on also). Part Three of Leading Change ties the book together by reinforcing the need for the eight-stage model for change and the need for continuously sharpening the saw. Kotter's model is a tool that leaders must apply; therefore, organizations and leaders must foster the capacity to address needed changes.

Re-read Summary

Change is a fact of life. John P. Kotter's book, Leading Change, defines his famous eight-stage model for change. The first stage of the model is establishing a sense of urgency. A sense of urgency provides the energy and rationale for any large, long-term change program. Once a sense of urgency has been established, the second stage in the eight-stage model for change is the establishment of a guiding coalition. If a sense of urgency provides energy to drive change, a guiding coalition provides the power for making change happen. A vision, built on the foundation of urgency and a guiding coalition, represents a picture of a state of being at some point in the future. Developing a vision and strategy is only a start; the vision and strategy must be clearly and consistently communicated to build the critical mass needed to make change actually happen. Once an organization is wound up and primed, the people within the organization must be empowered and let loose to create change. Short-term wins provide the feedback and credibility needed to deliver on the change vision. The benefits and feedback from the short-term wins and other environmental feedback are critical for consolidating gains and producing more change. Once a change has been made it needs to be anchored so that the organization does not revert to older, comfortable behaviors, throwing away the gains they have invested blood, sweat and tears to create.

The need for change is not abating. The eight-stage model for change requires leadership and vision.  Organizations need to foster leadership while both organizations and the people in those organizations must continually learn and hone their skills.

Next week we will review the list of books that readers of the blog and listeners to the podcast have identified as having a major impact on their career to vote on the next book we will tackle on Re-read Saturday.  Right now The Mythical Man Month by Fred Brooks is at the top of the list.  Care to influence the list?  Let me know the two books that most influenced your career.


Categories: Process Management

Stuff The Internet Says On Scalability For January 23rd, 2015

Hey, it's HighScalability time:


Elon Musk: The universe is really, really big  [Gigapixels of Andromeda [4K]]
  • 90: is the new 50 for woman designer; $656.8 million: 3 months of Uber payouts; $10 billion: all it takes to build the Internet in space; 1 billion: registered WeChat users
  • Quotable Quotes:
    • @antirez: Tech stacks, more replaceable than ever: hardware is better, startups get $$ (few nodes + or - who cares), alternatives countless.
    • Olivio Sarikas: If every Star in this Image was a 2 millimeter Sandcorn you would end up with 1110 kg of Sand!!!!!!!!!
    • Chad Cipoletti: In even simpler terms, we see brands as people.
    • @timoreilly: Love it: “We need a stack, not a pile” says @michalmigurski.
    • @neha: I would be very happy to never again see a distributed systems paper eval on a workload that would fit on one machine.
    • @etherealmind: OH: "oh yeah, the extra 4 PB of storage is being installed today. Its about 4 racks of gear".
    • @lintool: Andrew Moore: Google's ecommerce platform ingests 100K-200K events per second continuously. 

  • Programming as myth building. Myths to Live By: The true symbol does not merely point to something else. It contains in itself a structure which awakens our consciousness to a new awareness of the inner meaning of life and of reality itself. A true symbol takes us to the center of the circle, not to another point on the circumference.

  • Not shocking at all: "We found the majority of catastrophic failures could easily have been prevented by performing simple testing on error handling code...A majority (77%) of the failures require more than one input event to manifest, but most of the failures (90%) require no more than 3." Really, who has the time? More on human nature in Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems.

  • Let simplicity fail before climbing the complexity ladder. Scalability! But at what COST?: "Big data systems may scale well, but this can often be just because they introduce a lot of overhead. Rather than making your computation go faster, the systems introduce substantial overheads which can require large compute clusters just to bring under control. In many cases, you’d be better off running the same computation on your laptop." But notice the kicker: "it took some work for parallel union-find." Replacing smart work with brute force is often the greater win. What are a few machine cycles between friends?

  • Programming is the ultimate team sport, so Why are Some Teams Smarter Than Others? The smartest teams were distinguished by three characteristics. First, their members contributed more equally to the team's discussions. Second, their members were better at reading complex emotional states. Third, teams with more women outperformed teams with more men.

  • WhatsApp doesn't understand the web. Interesting design and discussions. Using proprietary Chrome APIs is a tough call, but this is more perplexing: "Your phone needs to stay connected to the internet for our web client to work." Is this for consistency reasons? To make sure the phone and the web stay in sync? Is it for monetization reasons? It does create a closed proxy that effectively prevents monetization leaks. It's tough to judge a solution without understanding the requirements, but there must be something compelling to impose so many limitations.

  • Roman Leventov analysis of Redis data structures. In which Salvatore 'antirez' Sanfilippo addresses point by point criticisms of Redis' implementation. People love Redis, part of that love has to come from what a good guy antirez is. Here he doesn't go all black diamond alpha nerd in the face of a challenge. He admits where things can be improved. He explains design decisions in detail. He advances the discussion with grace, humility, and smarts. A worthy model to emulate.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Agile Roles: What do product owners do other than make decisions?

 

The product owner role is anything but boring.

The role of the product owner is incredibly important. The decision-making role of a product owner helps grease the skids for the team so that they deliver value efficiently and effectively. That said, there is more to the role than making decisions. In the survey of practitioners (Agile Roles: What does a product owner do?) the next four items were:

      1. Attends Scrum meetings
      2. Prioritizes the user stories (and backlog)
      3. Grooms backlog
      4. Defines product vision and features

The product owner is a core member of the team. Participating in the Scrum meetings ensures that the voice of the customer is woven into all levels of planning and is not just a hurdle to be surmounted in a demo. When I was taught Scrum, the participation of the product owner was optional at the daily stand-up, in the retrospective and in more technical parts of sprint planning. Experience has taught me that optional typically translates to not present, and not present translates into defects and rework. Note, on the original list #15 was buy the pizza. I think the Scrum meetings are a good place to occasionally spring for pizza or DONUTS.

The backlog is "owned" by the product owner. The product owner prioritizes the backlog based on interaction with the whole team and other stakeholders. There are many techniques for prioritizing the backlog, ranging from business value and technical complexity to the squeaky wheel (usually not a good method). Regardless of the method, the final prioritization is delivered by the product owner.
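As a toy illustration of the business-value style of prioritization (my own sketch with invented stories and scores, not something from the survey), a backlog could be ordered by a simple value-over-effort ratio before the product owner makes the final call:

# Hypothetical backlog items with rough business value and effort scores (1-10)
backlog = [
    {"story": "Export report to PDF", "value": 8, "effort": 5},
    {"story": "Fix login timeout",    "value": 9, "effort": 2},
    {"story": "Dark mode",            "value": 3, "effort": 6},
]

# Highest value per unit of effort first; the product owner still owns the order
ranked = sorted(backlog, key=lambda s: float(s["value"]) / s["effort"], reverse=True)
for item in ranked:
    print(item["story"])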

As projects progress the backlog evolves. That evolution reflects new stories, new knowledge about the business problem, changes in the implementation approach and the need to break stories into smaller components. The process of making sure stories are well-formed, granular enough to complete and have acceptance criteria is story grooming. Grooming is often a small-team affair; however, the product owner is typically part of the grooming team. Techniques like the Three Amigos are useful for structuring the grooming approach.

The product owner interprets the sponsor's (the person with the checkbook and political capital to authorize the project) vision by providing the team with the product vision. The product vision represents the purpose or motivation for the project. Until the project is delivered, the vision is the picture that anyone involved with the project should be able to describe. Delivering the product vision and the vision for its features is a leadership role that helps teams decide how to deliver a function. Knowing where the project needs to end up gives the team the knowledge that supports making technical decisions.

The product owner is a leader, a doer, a visionary and a team member. As the voice of the customer the product owner describes the value proposition for the project from the business' point of view. As part of the team the product owner interprets and synthesizes information from other team members and outside stakeholders. This is reflected in the decisions and priorities that shape the project and the value it delivers.

 


Categories: Process Management

As a DBA Expert, which database would you choose?

This is a guest post by Jenny Richards, a professional database administrator who is currently employed at Remote DBA.

In the world of databases, there is no single silver bullet that fits every gun. How you select the database to use is very dependent on every other factor of your work:

  • Who are you and what do you do? 
  • What is your end goal – what are you working to achieve?
  • How much data do you intend to store?
  • On what language and OS platforms do your applications run?
  • What is your budget?
  • Will you also require data warehousing, decision support systems and/or BI?
Background information
Categories: Architecture

Business Analyst Tip: Seilevel Approach Works

Software Requirements Blog - Seilevel.com - Thu, 01/22/2015 - 16:00
I still remember my first project at Seilevel, more than 9 years ago. I struggled with the information flood and came out of that project with at least one clear goal: to get better at acquiring new information. In October, I started on a new project with a new client in an industry new to […]
Categories: Requirements

How to Start an Agile Project

Making the Complex Simple - John Sonmez - Thu, 01/22/2015 - 16:00

In this video, I quickly outline how I would start an Agile project from the ground up. I go over the idea of having a single product owner who is responsible for the product, rather than a committee of stakeholders.

The post How to Start an Agile Project appeared first on Simple Programmer.

Categories: Programming

Building a Credible Performance Measurement Baseline

Herding Cats - Glen Alleman - Thu, 01/22/2015 - 03:03

Many times I hear about Cost of Delay, Deliver Value, Measure Story Points, or Measure Stories, and a myriad of other assessments of project performance, all of which - OK, most of which - are examples of Open Loop Control.

Back in 2014, we had a paper in a publication of the College of Performance Management, starting on Page 17. As well, a colleague Nick Pisano (CDR US Navy Retired) has a post on the same topic at his blog.


The notion of a baseline, let alone a Performance Measurement Baseline, is at the heart of Closed Loop Control of all processes, from the heating and air conditioning system in your house, to the flight controls on the 737-700 winging its way back home to Denver, to the project you're working on - using whatever project management method or software development method you choose.

The notion that we can manage anything, the temperature of the room, the nice soft ride in the 737, or the probability of showing up on or before the need date, at or below the needed cost, with the needed capabilities - and NOT have a baseline to steer to is simply wrong. 

Below is the framework for Closed Loop control. This paradigm says simply:

  • State where we are going in units of time, cost, and technical performance:
    • I need this set of features (Capabilities) to be available for use by the customer on or before this date, with some confidence level, for some cost - again with a confidence level.
    • With these features - provided on a planned date, for a planned cost - I can then assess the progress toward that planned date, planned cost, and planned capabilities.
  • With the Planned data and the assessment of the actual data - cost, schedule, and technical performance:
    • Technical Performance is actually not enough
    • Measures of Effectiveness are needed
    • Measures of Performance as well
    • And other ...ilities of the outcomes - reliability, maintainability, serviceability, stability, etc.
  • Then with these measures we can generate an error signal - between planned to date and actual to date - to determine several critical things, without which we're flying Open Loop.
    • Given our Performance to Date and the Planned Performance at this point, how far behind are we, how over budget are we, how close are we to getting this gadget to work as needed?
    • With this data, we can then make a decision.
  • Making those decisions means
    • ESTIMATING both the to-be target - where we should be at this point in the project for cost, schedule and technical performance - and where we need to be to close the gaps between our target and the actual progress to date.
    • ESTIMATING what cost, effort, and changes in technical direction are needed to close those gaps.

ESTIMATING IS THE BASIS OF DECISION MAKING - it can't be any clearer than that.
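A minimal sketch of that error signal, with made-up numbers (my own illustration of the closed-loop idea, not Glen's model): compare the plan to the actuals, compute the gap, then estimate what it would take to close it.

# Hypothetical status point: where the plan says we should be vs. where we are
planned = {"cost": 120000, "features_done": 10}
actual  = {"cost": 135000, "features_done": 7}

# The "error signal" - the gap between planned and actual performance to date
cost_variance     = planned["cost"] - actual["cost"]                # negative = over budget
feature_shortfall = planned["features_done"] - actual["features_done"]

print("Cost variance: %d" % cost_variance)          # -15000, over budget
print("Feature shortfall: %d" % feature_shortfall)  # 3 features behind plan

# Closing the loop means ESTIMATING what it takes to close the gap, then deciding
if feature_shortfall > 0:
    cost_per_feature = float(actual["cost"]) / actual["features_done"]
    estimate_to_close = feature_shortfall * cost_per_feature
    print("Rough estimate to close the feature gap: %d" % estimate_to_close)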

Control systems from Glen Alleman

Related articles: Your Project Needs a Budget and Other Things; The Actual Science in Management Science; Probability and Statistics of Project Work; What is Governance?; Don't Manage By Quoting Dilbert
Categories: Project Management

Python/pdfquery: Scraping the FIFA World Player of the Year votes PDF into shape

Mark Needham - Thu, 01/22/2015 - 01:25

Last week the FIFA Ballon d’Or 2014 was announced and along with the announcement of the winner the individual votes were also made available.

Unfortunately they weren’t made open in a way that Ben Wellington (of IQuantNY fame) would approve of – the choice of format for the data is a PDF file!

I wanted to extract this data to play around with it but I wanted to automate the extraction as I’d done when working with Google Trends data.

I had a quick look for PDF scraping libraries in Python and R and eventually settled on Python’s pdfquery, mainly because there was lots of documentation which made it easy to get started.

One way you scrape data from a PDF is by locating an element on the page and then grabbing everything within a bounded box relative to that element.

In my case I had 17 pages all of which had a heading for each of six columns.


I wanted to grab the data in each of those columns but initially struggled working out what elements I should be looking for until I came across the following function which allows you to dump an XML version of the PDF to disk:

import pdfquery
pdf = pdfquery.PDFQuery("fboaward_menplayer2014_neutral.pdf")
pdf.load()
pdf.tree.write("/tmp/yadda", pretty_print=True)

The output looks like this:

$ head -n 10 /tmp/yadda
<pdfxml ModDate="D:20150110224554+01'00'" CreationDate="D:20150110224539+01'00'" Producer="Microsoft&#174; Excel&#174; 2010" Creator="Microsoft&#174; Excel&#174; 2010">
  <LTPage bbox="[0, 0, 841.8, 595.2]" height="595.2" pageid="1" rotate="0" width="841.8" x0="0" x1="841.8" y0="0" y1="595.2" page_index="0" page_label="">
    <LTAnon> </LTAnon>
    <LTTextLineHorizontal bbox="[31.08, 546.15, 122.524, 556.59]" height="10.44" width="91.444" word_margin="0.1" x0="31.08" x1="122.524" y0="546.15" y1="556.59"><LTTextBoxHorizontal bbox="[31.08, 546.15, 122.524, 556.59]" height="10.44" index="0" width="91.444" x0="31.08" x1="122.524" y0="546.15" y1="556.59">FIFA Ballon d'Or 2014 </LTTextBoxHorizontal></LTTextLineHorizontal>
    <LTAnon> </LTAnon>
    <LTAnon> </LTAnon>
    <LTAnon> </LTAnon>
    <LTAnon> </LTAnon>
    <LTAnon> </LTAnon>
    <LTAnon> </LTAnon>

Having scanned through the file I realised that what I needed to do was locate the ‘LTTextLineHorizontal’ element for each heading and then grab all the ‘LTTextLineHorizontal’ elements that appeared in that column.

I started out by trying to grab the ‘Name’ column on the first page:

>>> name_element = pdf.pq('LTPage[pageid=\'1\'] LTTextLineHorizontal:contains("Name")')[0]
>>> name_element.text
'Name '

Next I needed to get the other elements in that column. With a bit of trial and error I ended up with the following code:

x = float(name_element.get('x0'))
y = float(name_element.get('y0'))
cells = pdf.extract( [
         ('with_parent','LTPage[pageid=\'1\']'),
         ('cells', 'LTTextLineHorizontal:in_bbox("%s,%s,%s,%s")' % (x, y-500, x+150, y))
    ])
 
>>> [cell.text.encode('utf-8').strip() for cell in cells['cells']]
['Amiri Islam', 'Cana Lorik', 'Bougherra Madjid', 'Luvu Rafe Talalelei', 'Sonejee Masand Oscar', 'Amaral Felisberto', 'Liddie Ryan', 'Griffith Quinton', 'Messi Lionel', 'Berezovskiy Roman', 'Breinburg Reinhard', 'Jedinak Mile', 'Fuchs Christian', 'Sadigov Rashad', 'Gavin Christie', 'Hasan Mohamed', 'Mamun Md Mamnul Islam', 'Burgess Romelle', 'Kalachou Tsimafei', 'Komany Vincent', 'Eiley Dalton', 'Nusum John', 'Tshering Passang', 'Raldes Ronald', 'D\xc5\xbeeko Edin', 'Da Silva Santos Junior Neymar', 'Ceasar Troy', 'Popov Ivelin', 'Kabore Charles', 'Ntibazonkiza Saidi', 'Kouch Sokumpheak']

I cleaned that up and generified it to work for any page and for columns of different widths. This is what the function looks like:

def extract_cells(page, header, cell_width):
    name_element = pdf.pq('LTPage[pageid=\'%s\'] LTTextLineHorizontal:contains("%s")' % (page, header))[0]
    x = float(name_element.get('x0'))
    y = float(name_element.get('y0'))
    cells = pdf.extract( [
         ('with_parent','LTPage[pageid=\'%s\']' %(page)),
         ('cells', 'LTTextLineHorizontal:in_bbox("%s,%s,%s,%s")' % (x, y-500, x+cell_width, y))
    ])
    return [cell.text.encode('utf-8').strip() for cell in cells['cells']]

We can then call that for each column on the page and zip together the resulting arrays to get a tuple for each row:

roles = extract_cells(1, "Vote", 50)
countries = extract_cells(1, "Country", 150)
voters = extract_cells(1, "Name", 170)
first = extract_cells(1, "First (5 points)", 150)
second = extract_cells(1, "Second (3 points)", 150)
third = extract_cells(1, "Third (1 point)", 130)
 
>>> for vote in zip(roles, countries, voters, first, second, third)[:5]:
       print vote
 
('Captain', 'Afghanistan', 'Amiri Islam', 'Messi Lionel', 'Cristiano Ronaldo', 'Ibrahimovic Zlatan')
('Captain', 'Albania', 'Cana Lorik', 'Cristiano Ronaldo', 'Robben Arjen', 'Mueller Thomas')
('Captain', 'Algeria', 'Bougherra Madjid', 'Cristiano Ronaldo', 'Robben Arjen', 'Benzema Karim')
('Captain', 'American Samoa', 'Luvu Rafe Talalelei', 'Neymar', 'Robben Arjen', 'Cristiano Ronaldo')
('Captain', 'Andorra', 'Sonejee Masand Oscar', 'Cristiano Ronaldo', 'Mueller Thomas', 'Kroos Toni')

The next step was to write out each of those rows to a CSV file so we can use it from another program. The full script looks like this:

import pdfquery
import csv
 
def extract_cells(page, header, cell_width):
    name_element = pdf.pq('LTPage[pageid=\'%s\'] LTTextLineHorizontal:contains("%s")' % (page, header))[0]
    x = float(name_element.get('x0'))
    y = float(name_element.get('y0'))
    cells = pdf.extract( [
         ('with_parent','LTPage[pageid=\'%s\']' %(page)),
         ('cells', 'LTTextLineHorizontal:in_bbox("%s,%s,%s,%s")' % (x, y-500, x+cell_width, y))
    ])
    return [cell.text.encode('utf-8').strip() for cell in cells['cells']]
 
if __name__ == "__main__":
    pdf = pdfquery.PDFQuery("fboaward_menplayer2014_neutral.pdf")
    pdf.load()
    pdf.tree.write("/tmp/yadda", pretty_print=True)
 
    pages_in_pdf = len(pdf.pq('LTPage'))
 
    with open('votes.csv', 'w') as votesfile:
        writer = csv.writer(votesfile, delimiter=",")
        writer.writerow(["Role", "Country", "Voter", "FirstPlace", "SecondPlace", "ThirdPlace"])
        for page in range(1, pages_in_pdf + 1):
            print page
            roles = extract_cells(page, "Vote", 50)
            countries = extract_cells(page, "Country", 150)
            voters = extract_cells(page, "Name", 170)
            first = extract_cells(page, "First (5 points)", 150)
            second = extract_cells(page, "Second (3 points)", 150)
            third = extract_cells(page, "Third (1 point)", 130)
            votes = zip(roles, countries, voters, first, second, third)
            print votes
            for vote in votes:
                writer.writerow(list(vote))

The code is on github if you want to play around with it or if you just want to grab the votes data that’s there too.
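Once votes.csv exists, a quick way to start playing with it is pandas, using the column names the script writes out (this is my own sketch with a reasonably recent pandas; I haven't checked the tallies against the official results):

import pandas as pd

votes = pd.read_csv("votes.csv")

# Who picked up the most first-place (5 point) votes?
print(votes["FirstPlace"].value_counts().head(5))

# Weight the places 5/3/1 to approximate the official scoring
points = (votes["FirstPlace"].value_counts() * 5) \
    .add(votes["SecondPlace"].value_counts() * 3, fill_value=0) \
    .add(votes["ThirdPlace"].value_counts(), fill_value=0)
print(points.sort_values(ascending=False).head(5))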

Categories: Programming

Learn from my pain - 5 Lessons from Ello's Adventures in Rapid Scaling

Within one week Ello went from thousands of sessions a day to a few million sessions a day. Mike Pack wrote a great article sharing what they’ve learned: 5 Early Lessons from Rapid, High Availability Scaling with Rails.

Some of their scaling challenges: quantity of data, team size, DNS, bot prevention, responding to users, inappropriate content, and other forms of caching. What did they learn?

  1. Move the graph. User relationships were implemented on a standard Rails stack using Heroku and Postgres. The relationships table became the bottleneck. Solution: denormalize the social graph and move hot data into Redis. Redis is used for speed and Postgres is used for durability. Lesson: know the core pillar that supports your core offering and make it work. (A rough sketch of the Redis side follows this list.)

  2. Create indexes early, or you're screwed. There's a camp that says only create indexes when they are needed. They are wrong. The lack of btree indexes kills query performance. Forget a unique index and your data becomes corrupted. Once the damage is done it's hard to add unique indexes later. The data has to be cleaned up and indexes take a long time to build when there's a lot of data.

  3. Sharding is cool, but not that cool. Shard all the things only after you've tried vertically scaling as much as possible. Sharding caused a lot of pain. Creating a covering index from the start and adding more RAM so data could be served from memory, not from disk, would have saved a lot of time and stress as the system scaled.

  4. Don't create bottlenecks, or do. Every new user automatically followed a system user that was used for announcements, etc. Scaling problems that would have been months down the road hit quickly as any write to the system user caused a write amplification of millions of records. The lesson here is not what you may think. While scaling to meet the challenge of the system user was a pain, it made them stay ahead of the scaling challenge. Lesson: self-inflict problems early and often.

  5. It always takes 10 times longer. All the solutions mentioned take much longer to implement than you might think. Early estimates of a couple of days soon give way to the reality of much longer timelines. Simply moving large amounts of data can take days. Adding indexes to large amounts of data takes time. And with large amounts of data, problems tend to happen as you get to the larger data sizes, which means you need to apply a fix and start over.
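For lesson 1, here's a rough sketch of what "move the graph into Redis" can look like using redis-py; the key names are invented, and as the article describes, Postgres would still hold the durable copy:

import redis

r = redis.StrictRedis(host="localhost", port=6379)

def follow(follower_id, followed_id):
    # Denormalized social graph: one set per user, kept hot in memory
    r.sadd("followers:%d" % followed_id, follower_id)
    r.sadd("following:%d" % follower_id, followed_id)
    # ...the same relationship would also be written to Postgres for durability

def followers(user_id):
    return r.smembers("followers:%d" % user_id)

follow(42, 7)
print(followers(7))  # follower ids served from RAM instead of a relational JOIN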

This full article is excellent and is filled with much more detail that makes it well worth reading.

Categories: Architecture

We Need Planning; Do We Need Estimation?

As I write the program management book, I am struck by how difficult it is to estimate large chunks of work.

In Essays on Estimation and Manage It!, I recommend several approaches to estimation, each of which include showing that there is no one absolute date for a project or a program.

What can you do? Here are some options:

  1. Plan to replan. Decide how much to invest in the project or program for now. See (as in demo) the project/program progress. Decide how much longer you want to invest in the project or program.
  2. Work to a target date. A target date works best if you work iteratively and incrementally. If you have internal releases often, you can see project/program progress and replan. (If you use a waterfall approach, you are not likely to meet the target with all the features you want and defects you don't want. If you work iteratively and incrementally, you refine the plan as you approach the target. Notice I said refine the plan, not the estimate.)
  3. Provide a 3-point estimate: possible, likely, and worst case. This is PERT estimation (see the sketch after this list).
  4. Provide a percentage confidence with your estimate. You think you can release near a certain date. What is your percentage confidence in that date? This works best with replanning, so you can update your percentage confidence.
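For option 3, a minimal sketch assuming the usual PERT weighting of (optimistic + 4 x likely + pessimistic) / 6; the numbers are invented:

def pert_estimate(possible, likely, worst):
    """Three-point (PERT) expected value and a rough standard deviation."""
    expected = (possible + 4.0 * likely + worst) / 6.0
    std_dev = (worst - possible) / 6.0
    return expected, std_dev

expected, std_dev = pert_estimate(possible=4, likely=6, worst=12)  # weeks
print("Expected: %.1f weeks, +/- %.1f weeks" % (expected, std_dev))  # 6.7 +/- 1.3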

Each of these shows your estimation audience you have uncertainty. The larger the project or program, the more you want to show uncertainty.

If you are agile, you may not need to estimate at all. I have managed many projects and programs over the years. No one asked me for a cost or schedule estimate. I received targets. Sometimes, those targets were so optimistic, I had to do a gross estimate to explain why we could not meet that date.

However, I am not convinced anything more than a gross estimate is useful. I am convinced an agile roadmap, building incrementally, seeing progress, and deciding what to do next are good ideas.

Agile Roadmap

When you see this roadmap, you can see how we have planned for an internal release each month.

With internal releases, everyone can see the project or program progress.

In addition, we have a quarterly external release. Now, your project or program might not be able to release to your customers every quarter. But, that should be a business decision, not a decision you make because you can’t release. If you are not agile, you might not be able to meet a quarterly release. But, I’m assuming you are agile.

Agile Roadmap, One Quarter at a Time

In the one-quarter view, you can see the Minimum Viable Products.

You might need to replace MVPs with MIFS, Minimum Indispensable Feature Sets, especially at the beginning.

If you always make stories so small that you can count them, instead of estimate them, you will be close. You won’t spend time estimating instead of developing product, especially at the beginning.

You know the least about the risks and gotchas at the beginning of a project or program. You might not even know much about your MIFS or MVPs. However, if you can release something for customer consumption, you can get the feedback you need.

Feedback is what will tell you:

  • Are these stories too big to count? If so, any estimate you create will be wrong.
  • Are we delivering useful work? If so, the organization will continue to invest.
  • Are we working on the most valuable work, right now? What is valuable will change during the project/program. Sometimes, you realize this feature (set) is less useful than another. Sometimes you realize you’re done.
  • Are we ready to stop? If we met the release criteria early, that’s great. If we are not ready to release, what more do we have to do?

Here’s my experience with estimation. If you provide an estimate, managers won’t believe you. They pressure you to “do more with less,” or some such nonsense. They say things such as, “If we cut out testing, you can go faster, right?” (The answer to that question is, “NO. The less technical debt we have or create, the faster we can go.”)

However, you do need the planning of roadmaps and backlogs. If you don’t have a roadmap that shows people something like what they can expect when, they think you’re faking. You need to replan the roadmap, because what the teams deliver won’t be everything the product owner wanted. That’s okay. Getting feedback about what the teams can do early is great.

There are two questions you want to ask people who ask for estimates:

  1. How much would you like to invest in this project/program before we stop?
  2. How valuable is this project/program to you?

If you work on the most valuable project/program, why are you estimating it? You need to understand how much the organization wants to invest before you stop. If you’re not working on the most valuable project/program, you still want to know how much the organization wants to invest. Or, you need a target date. With a target date, you can release parts iteratively and incrementally until you meet the target.

This is risk management for estimation and replanning. Yes, I am a fan of #noestimates, because the smaller we make the chunks, the easier it is to see what to plan and replan.

We need planning and replanning. I am not convinced we need detailed estimation if we use iterative and incremental approaches.

Categories: Project Management

Continuous Delivery across multiple providers

Xebia Blog - Wed, 01/21/2015 - 13:04

Over the last year three of the four customers I worked with had a similar challenge with their environments. In different variations they all had their environments set up across separate domains, ranging from physically separated on-premise networks to environments running across different hosting providers managed by different parties.

Regardless of the reasoning behind having these kinds of setups, it's a situation where the continuous delivery concepts really add value. The stereotypical problems that exist with manual deployment and testing practices tend to get amplified when they occur in separated domains. Things get even worse when you add more parties to the mix (like external application developers). Sticking to doing things manually is a recipe for disaster unless you enjoy going through expansive procedures every time you want to do anything in any of 'your' environments. And if you've outsourced your environments to an external party you probably don't want to have to (re)hire a lot of people just so you can communicate with your supplier.

So how can continuous delivery help in this situation? By automating your provisioning and deployments you make deploying your applications, if nothing else, repeatable and predictable. Regardless of where they need to run.

Just automating your deployments isn't enough, however; a big question that remains is who does what, a question that is most likely backed by a lengthy contract. Agreements between all the parties are meant to provide an answer to that very question. A development partner develops, an outsourcing partner handles the hardware, etc. But nobody handles bringing everything together...

The process of automating your steps already provides some help with this problem. In order to automate you need some form of agreement on how to provide input for the tooling. This at least clarifies what the various parties need to produce. It also clarifies what the result of a step will be. This removes some of the fuzziness from the process. Questions like whether the JVM is part of the OS or part of the middleware should become clear. But not everything is that clear-cut. It's in the parts of the puzzle where pieces actually come together that things turn gray. A single tool may need input from various parties. Here you need to resist the common knee-jerk reaction to shield said tool from other people with procedures and red tape. Instead, provide access to those tools to all relevant parties and handle your separation of concerns through a reliable access mechanism. Even then there might be some parts that can't be used by just a single party and in that case, *gasp*, people will need to work together.

What this results in is an automated pipeline that will keep your environments configured properly and allow applications to be deployed onto them when needed, within minutes, wherever they may run.

MultiProviderCD

The diagram above shows how we set this up for one of our clients. Using XL Deploy, XL Release and Puppet as the automation tooling of choice.

In the first domain we have a git repository to which developers commit their code. A Jenkins build is used to extract this code, build it and package it in such a way that the deployment automation tool (XL Deploy) understands. It’s also kind enough to make that package directly available in XL Deploy. From there, XL Deploy is used to deploy the application not only to the target machines but also to another instance of XL Deploy running in the next domain, thus enabling that same package to be deployed there. This same mechanism can then be applied to the next domain. In this instance we ensure that the machines we are deploying to are consistent by using Puppet to manage them.

To round things off we use a single instance of XL Release to orchestrate the entire pipeline. A single release process is able to trigger the build in Jenkins and then deploy the application to all environments spread across the various domains.

A setup like this lowers deployment errors that come with doing manual deployments and cuts out all the hassle that comes with following the required procedures. As an added bonus your deployment pipeline also speeds up significantly. And we haven’t even talked about adding automated testing to the mix…

Small Basic on Mac & Linux

Phil Trelford's Array - Wed, 01/21/2015 - 09:29

Microsoft’s Small Basic is a simple programming language and environment aimed at beginners.

It ships with an IDE for Windows, a command-line compiler and a small .Net library. Small Basic programs can also be run in the browser on Windows & Mac via Silverlight.

The shipped .Net library for Small Basic targets WPF for graphics, which is unfortunately not supported on Mono, meaning Small Basic apps will not run directly on Mac or Linux.

To get Small Basic apps running from the command prompt on Mac and Linux, all that is needed is a new library without the WPF dependency.

Recently I knocked up such a library providing support for command-line input and output; graphics support is a work in progress.

But this does mean I can now write and run FizzBuzz, or even work through the majority of the Small Basic Tutorial, on Linux or Mac via Mono:

Small Basic on Mac

Combine this with my open source Small Basic compiler project (written in F#) and there's now a cross-platform version of Small Basic :)

If you fancy having a play with an early version of the source download it here: http://trelford.com/SmallBasicLibrary2012.zip

Future work

I'm currently evaluating GtkSharp, OpenTK and WinForms as options for a cross-platform version of the graphics library.

As well as the compiler, I’ve also written an interpreter for Small Basic which means it should be possible to edit and run programs on iOS and Android, but that’s another story…

Categories: Programming