July 23, 2021

Preparing to Work with DateTime Information

Tomasz Lelek

Jon Skeet

Limiting your scope

The world of date/time information can get bewilderingly complicated. The good news is that you probably don’t need all that complexity in your application. When you start planning either a whole application or an individual feature that uses dates and times, it’s worth explicitly trying to limit the scope of your work and documenting the decisions you make.

You can probably start off by ruling out the most complex and niche aspects:

Does your application need to deal with relativity?
Do you need to be aware of and account for leap seconds?
Do you need to work with dates that are sufficiently far in the past that historical calendar system changes might be relevant?

If the answer to any of these questions is “yes” then you may find that you’re limited in terms of the libraries that you can use, and you’ll definitely want to take even more care than usual and do plenty of research into the niche you’re stuck with. I don’t have much more specific advice than that, as I’ve never had to work on that sort of application, but I would expect that choosing appropriate types to represent product concepts is even more important than normal.

The second level of complexity to think about is around calendar systems and time zones.

Do you need to work with any calendar system other than the Gregorian calendar? Most business applications can probably just use the Gregorian calendar, but there will certainly be counterexamples, particularly if your application’s audience is a religious community that pays particular attention to a specific calendar. Consumer applications are slightly more likely to need support for the preferred calendar system of the user, but you probably want to weigh up the costs and benefits of doing so before committing. (The benefits will be application-specific, and the costs may well be technology-specific; support for non-Gregorian calendar systems varies significantly.)

The level of complexity around time zones can vary significantly. Questions to ask yourself here include:

Does the product need to support time zones at all? Sometimes an entire application can be built around the “machine time” concepts, which can simplify things a lot.
Does the product need to interoperate with time zones specified by another system? If so, which time zone database does it use?
Does the product need to allow users to choose time zones, or can you just rely on detecting their default time zone?
Does the product need to work in more than a single time zone? If not, are you confident it will stay that way?
Does the product need to keep absolutely up-to-date in terms of time zone rules, actively keeping track of changes, or can it just use the time zone rules that come by default with the platform or library?
Does the product need to store any data that naturally includes time zone information, or is any time zone interaction purely for display purposes?
How much attention do you need to pay to time zone transitions, in terms of skipped and ambiguous times? If you’re writing a school timetable system for example, it’s unlikely that pupils will have lessons at the time of a transition.
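To make “skipped and ambiguous times” concrete, here is a minimal sketch using java.time. The class name TransitionCheck and the choice of Europe/Paris with its 2021 transitions are assumptions made purely for illustration. The point is that a local date/time can correspond to zero, one, or two UTC offsets in a given time zone.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.List;

// Hypothetical helper, for illustration only.
class TransitionCheck {
    enum Kind { SKIPPED, UNAMBIGUOUS, AMBIGUOUS }

    // Classifies a local date/time by how many UTC offsets are valid for it in the zone.
    static Kind classify(LocalDateTime local, ZoneId zone) {
        List<ZoneOffset> offsets = zone.getRules().getValidOffsets(local);
        switch (offsets.size()) {
            case 0: return Kind.SKIPPED;      // falls in the gap when clocks go forward
            case 1: return Kind.UNAMBIGUOUS;  // the normal case
            default: return Kind.AMBIGUOUS;   // falls in the overlap when clocks go back
        }
    }

    public static void main(String[] args) {
        ZoneId paris = ZoneId.of("Europe/Paris");
        // Clocks went forward from 02:00 to 03:00 on 2021-03-28, so 02:30 never happened.
        System.out.println(classify(LocalDateTime.of(2021, 3, 28, 2, 30), paris));  // SKIPPED
        // Clocks went back from 03:00 to 02:00 on 2021-10-31, so 02:30 happened twice.
        System.out.println(classify(LocalDateTime.of(2021, 10, 31, 2, 30), paris)); // AMBIGUOUS
        System.out.println(classify(LocalDateTime.of(2021, 7, 23, 9, 0), paris));   // UNAMBIGUOUS
    }
}
```

If your scoping decision is “we never accept local times near a transition”, a check like this is one way to reject such input rather than silently guessing.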

Most applications that need to display date/time values to a user will need some time zone awareness, but you may well be able to make your life much simpler by not building in more flexibility than you need. There’s a trade-off here, of course: if you write your code with the assumption that you’ll only ever need to work with (say) the time zone for Paris, you may well find it’s quite hard to undo the impact of that assumption later on. It really can make a large difference in terms of simplicity though. One way to mitigate that risk of future requirements is to make sure everyone on the team is aware of the assumptions that are being made, and reflects on when they’re relying on them. Keeping a document of places in the system where the assumptions are relevant can make it much easier to backtrack later on.

This sort of scoping is usually possible before you have detailed product or feature requirements. It would be relatively rare for a product to unexpectedly change to needing to support multiple calendar systems, for example. (It’s possible, of course. That sort of new requirement is more likely to be part of an expansion into new markets than part of adding a new individual feature.) The developers in the team can probably work through the questions above themselves, then document and validate the results with the product owners.

Terminology: product owners

I’m using the term “product owners” to represent “the people who are responsible for deciding what the product should do”. Different companies may use different names, such as product managers. Depending on your exact development model, these may be people within the same company as the developers, a different company, or a mixture. They may be the developers themselves, but it’s worth treating this as a role that’s separate from deciding how a product should be implemented.

When it comes to detailed requirements, however, the product owners must be involved.

Clarifying date/time requirements

I should start this section with a warning: ensuring that the product requirements around date/time work are clear and unambiguous is unlikely to make you popular. You’re likely to be faced with many responses of “isn’t it obvious?” even if the obvious answer for one person is different to the obvious answer for someone else. But the effort is worth it. Once the requirements are clear, the coding is often straightforward. Without clear requirements, you may well find that each individual involved in the product has different expectations, leading to chaos.

Exactly how you decide to plan and document your requirements is up to you, of course. There’s no particular required methodology. You may have a “big up-front design” or you may be designing individual small features as you go in a more agile approach. It’s worth being careful in the “only design what you need right now” style though: if you only need a date for a particular piece of information in the first sprint, but then find you need a date and time (and maybe time zone) by the time you reach sprint four, that will make life significantly harder. Try to anticipate future natural requirements to some extent, without going too far down the rabbit hole of planning for every possible eventuality.

There are broadly two kinds of decision that should be recorded as part of the requirements documentation: how you’re treating each piece of date/time-related data, and how you operate on them. You’ll also need to consider representations for storage and transmission, but those are more implementation details than product requirements. The two kinds of decision are related, but we’ll consider them separately.

Figure 7. A high-level requirement that needs more details

To try to make everything concrete, we’ll use an online shopping scenario to start with. The TL;DR of the requirement we’ll look at is “customers can return items within 3 months”. By the end of the scenario, we’ll have a set of requirements which can be implemented and tested.

Picking the right concepts or data types

Good product requirements usually state what information is collected in a given situation, and potentially what information is deliberately not being collected. Sometimes this is implicit, somewhat buried within a narrative describing the user journey, but it’s clearer if it’s called out explicitly. It’s usually easy to spot date/time-related information, but it can be harder to decide how you’re going to treat that data.

As a first rule of thumb, it’s worth considering the source of the data. If you’re recording that “something has happened” then you should usually start off with an instant - the instant at which the event occurred. You may also want to record a time zone (or more generally a location) if that’s going to be relevant to other operations. Recording the instant is usually straightforward - most databases and logging systems have built-in timestamps.

Note: Whose “now” is it anyway?

You may need to consider what source of “current time” is important: if you capture “now” in both the database and on a separate web server, the two clocks involved may not be perfectly synchronized. Whether or not that’s important will depend on your application.

If you’re recording a date/time value that is provided by a user, that’s a different matter. You’re in the realm of civil time rather than machine time at that point - even if they’re reporting when something happened. You almost certainly need to bear time zone information in mind, or at least a UTC offset. You may be tempted to convert that into an instant, but I’d encourage you to retain “exactly what the user gave you” - or at least a representation which is parsed, but not necessarily transformed. The approach of “just store UTC” can go wrong, particularly when recording information about the future.
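As a rough sketch of what retaining “exactly what the user gave you” could look like in java.time, the hypothetical class below stores the parsed local date/time and the time zone, and only derives an instant when asked. The class and method names are my own invention for illustration, not part of any library.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical type: keeps the user's input and derives the instant on demand.
final class UserProvidedDateTime {
    private final LocalDateTime local; // what the user entered, parsed but not transformed
    private final ZoneId zone;         // the time zone the user meant

    UserProvidedDateTime(LocalDateTime local, ZoneId zone) {
        this.local = local;
        this.zone = zone;
    }

    LocalDateTime local() { return local; }
    ZoneId zone() { return zone; }

    // Derived using whatever time zone rules are loaded when this is called,
    // so a rule change published after the value was stored is picked up.
    Instant toInstant() {
        return local.atZone(zone).toInstant();
    }
}
```

Had we stored only the UTC instant, a later change to the zone’s rules would leave us with a value that no longer matches the local time the user asked for, and no way to tell.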

For our customer returns requirement, we obviously need to capture some information, but it’s not immediately clear what that information should be, let alone what representation to use. The first question to ask of the product owner is “customers can return items within 3 months of what?” For example, it might be:

Within 3 months of the user clicking “pay”
Within 3 months of the payment being accepted
Within 3 months of the order being confirmed
Within 3 months of the stock being allocated
Within 3 months of the order being shipped
Within 3 months of the order being received

We’ll be thinking about what “3 months” means later on, but the list above shows six different instants in time. Even within the fifth bullet of “shipped” there may be several different instants, but for simplicity we’ll assume we can agree on one of those being the relevant one.

Importantly though, these are all instants in time, and it would make sense to record them all within the order. Some aspects may be on a per-item basis rather than a per-order basis, such as stock allocation or even shipping - the order may be shipped in multiple deliveries. All of these are aspects that the product owner should be considering in the context of “customers can return items within 3 months”.

Let’s assume the product owner replies that for any given item, the customer can return that item within 3 months of it shipping. (So the returns window may vary between items even within the same order.) Great - this is already a lot more precise.

We’ll probably be recording various other instants, but we know we need to record the instant at which each item was shipped. That’s still not the final solution though. We know that “3 months” is a period, not a duration - and you can’t add a period to an instant. We’re going to have to derive some other information from that instant, in order to consider it in civil time. That means we have to consider calendar systems and time zones.
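The difference between a period and a duration shows up directly in java.time, which refuses to add a period of months to an instant. The sketch below uses a made-up shipping instant and a hypothetical class name.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.Period;
import java.time.temporal.UnsupportedTemporalTypeException;

// Hypothetical demo class, for illustration only.
class PeriodVersusDuration {
    // Returns true if java.time accepts adding the period directly to the instant.
    static boolean canAddToInstant(Instant instant, Period period) {
        try {
            instant.plus(period);
            return true;
        } catch (UnsupportedTemporalTypeException e) {
            // A month has no fixed length without a calendar and a time zone.
            return false;
        }
    }

    public static void main(String[] args) {
        Instant shipped = Instant.parse("2021-11-30T03:15:00Z"); // example value
        // A duration is a fixed amount of elapsed time, so this is fine.
        System.out.println(shipped.plus(Duration.ofHours(72)));           // 2021-12-03T03:15:00Z
        // A period of months is not, so java.time rejects it.
        System.out.println(canAddToInstant(shipped, Period.ofMonths(3))); // false
    }
}
```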

Tip: Store canonical information

We all know that product requirements can change. The decision of “the shipping time determines the returns window” could change - and so could the answers to the questions we ask later on. If you keep all the “raw” information from the start, you can change your decision later on. That means we should record all of the instants listed earlier… and store them as instants even if we later derive more information from them.

This is related to the earlier advice that retaining “what the user gave you” is important if the user specifies a date and/or time. The canonical information in that case isn’t an instant as recorded by a machine clock - it’s the user input.

First, we can ask the product owner what calendar system we should be using. This is likely to be a simple one: the Gregorian calendar system, regardless of the user. (If the product owner gives any other answer at that point, you should probably allow for a lot more testing time.)

Second, we can ask the product owner what time zone they’re interested in. This is where it’s useful to have a specific example to hand, in order to keep things concrete. You might want to give a scenario of:

A web server in Brazil,
… storing data in a database in New York
… placing an order for a company based in California
… shipping items from a warehouse in Texas
… for a customer with a billing address in Berlin, Germany
… shipping to an address in Sydney, Australia

The instant at which the item is considered “shipped” will represent different local times and possibly even different dates in each of those places. So what’s important here? One big hint: it almost certainly shouldn’t be the web server or database. Just about any other answer is plausible, but products should almost never behave differently based on the physical location of the computers involved, unless the users are sitting in front of those computers.

Even if the product owner thinks that’s a far-fetched situation, they should be able to decide what the right answer is, and document that decision. It also naturally forms the starting point of an acceptance test.

Let’s suppose the product owner answers that the relevant time zone is the one we’re shipping to - so Sydney, Australia, in this case. Fantastic. That probably doesn’t mean we need to store any more information: we’ve already got the location we’re shipping to (from which we can derive the time zone), the instant at which the item shipped as a canonical starting point, and the “always use the Gregorian calendar” decision from earlier on. We can convert the instant into a local time at the shipping location whenever we want to. It may be useful to store that directly in the database, but that’s an implementation detail.
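Here is a small java.time sketch of that conversion, using a made-up shipping instant. The class name is hypothetical, and I’ve assumed America/Chicago for the Texas warehouse, which is only right for the parts of Texas on Central time.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper, for illustration only.
class ShippedDate {
    // The ISO calendar date (the same as Gregorian for modern dates) on which
    // the instant fell in the given time zone.
    static LocalDate localDateAt(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        Instant shipped = Instant.parse("2021-07-23T02:30:00Z"); // example value
        // Delivery address: the zone the product owner chose.
        System.out.println(localDateAt(shipped, ZoneId.of("Australia/Sydney"))); // 2021-07-23
        // Warehouse in Texas (assumed to be on Central time).
        System.out.println(localDateAt(shipped, ZoneId.of("America/Chicago")));  // 2021-07-22
        // Billing address.
        System.out.println(localDateAt(shipped, ZoneId.of("Europe/Berlin")));    // 2021-07-23
    }
}
```

The same stored instant gives a different shipping date at the warehouse than at the delivery address, which is why the product owner’s decision needs to be written down.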

With that information in hand, we can move on to the rest of the questions about this feature.

Asking questions about behavior

The broad statement of “customers can return items within 3 months” needs all kinds of clarifications. We’ve identified the starting point of that 3 months, but there’s still a lot more detail required before we can start implementing anything. Of course, any product owner doing their job properly would put a lot of that detail into the requirements naturally, but we’re focusing on the date/time-related details.

Suppose the actual user journey documented is along these lines:

When viewing a completed order on the web site, any item that was shipped less than 3 months ago is displayed with an option to return the item. When the customer clicks on that option, they are presented with a form containing the details for the return. Once they have completed the form, the returns procedure is initiated.

There would be a lot of detail about the returns procedure, but there are two date/time aspects that need clarifying here.

Firstly, should the “3 months” apply to when the user viewed the completed order, when they clicked on the option to start the returns process, or when they submitted the returns form? Those are three different instants in time. It would be irritating for a customer if they viewed the order when it was valid to return the item, but if they then clicked on the returns option a minute later, the web site said it wasn’t valid any more. On the other hand, we don’t want a loophole where a user can leave a browser window up for years and effectively have an unlimited returns period. The same question could apply for completing the returns form.

Here’s one possible set of requirements with more details:

When viewing a completed order on the web site, any item that was shipped less than 3 months ago is displayed with an option to return the item.

When the customer clicks on that option, the server checks whether the returns option was valid 5 minutes earlier, and returns an error if it wasn’t. That allows customers a delay of up to 5 minutes between viewing the order and starting the returns process, when we guarantee to honor the return. (It also means that if the customer waited for more than 5 minutes but they’re still within the returns period anyway, they can still proceed to the returns form.) If the check passes, a returns form is presented to the customer. The form states that it must be completed within 2 hours.

When the returns form is submitted, the server checks that the returns procedure was started within the last 2 hours, and returns an error if it wasn’t. If the check passes, the form is submitted for processing and a confirmation screen is shown to the customer.

This has two different kinds of time limit: one that provides a sort of grace period of 5 minutes beyond the strict “you must start the returns process by time X” and a second which limits how long you can spend on the returns form itself.
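Both of these limits are fixed amounts of elapsed time, so they can be expressed with instants and durations alone. The sketch below is one possible reading of the requirements: the class and method names are hypothetical, the deadline parameter (the last instant at which the returns option is valid) is an assumed input computed elsewhere, and treating both boundaries as inclusive is my assumption rather than something the requirements state.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, for illustration only.
class ReturnsTimeLimits {
    static final Duration CLICK_GRACE = Duration.ofMinutes(5);
    static final Duration FORM_LIMIT = Duration.ofHours(2);

    // True if the returns option was still valid 5 minutes before the click.
    // 'deadline' is an assumed input: the last instant at which the option is valid.
    static boolean canStartReturn(Instant clickedAt, Instant deadline) {
        return !clickedAt.minus(CLICK_GRACE).isAfter(deadline);
    }

    // True if the form is submitted no more than 2 hours after the return was started.
    static boolean canSubmitForm(Instant startedAt, Instant submittedAt) {
        return !submittedAt.isAfter(startedAt.plus(FORM_LIMIT));
    }
}
```

Whether a click exactly 5 minutes after the deadline should pass is precisely the sort of boundary question worth putting to the product owner.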

We’re now halfway to a good set of requirements from a date/time perspective. There’s still the gnarly bit about “less than 3 months ago” though. We’ve already decided that the starting time of the 3 months is “the instant when the item was shipped” and that the 3 months should be oriented around the time zone of the delivery address. There’s still a bit of work to do in terms of precision, however.

Arithmetic involving calendars doesn’t follow the same rules we’re used to with regular math. So in this case, we need to differentiate between “taking the shipping time and adding 3 months” and “taking the current time and subtracting 3 months”. The product owner will also need to work out what they want to do about granularity: if something ships at 10am, do they want the 3 months to run out at 10am 3 months later? That could feel a little arbitrary to customers. If it’s what the product owner decides, of course, then that’s the requirement. But here’s the sort of requirement I would probably write if I were a product owner:

The option to return an item is based on the date on which the item was shipped, in the time zone of the delivery address. The “last date on which a return is valid” is calculated by adding 3 months to the current date at the delivery location when the item is shipped. If adding 3 months to the shipping date goes beyond the end of the month, the start of the next month is used. (Example: if an item ships on November 30th, the last valid return date is the following March 1st, not the last day of February.) The “return an item” option is shown to the customer so long as the current date at the delivery location is not later than the last valid return date.

That’s quite wordy, but it’s unambiguous. It covers:

The granularity we’re using (date, not date and time)
The nature of the calendar arithmetic (adding to the start date)
The nature of the check (the last date is inclusive)
The time zone involved (the delivery address)
The way in which the calendar arithmetic is resolved (roll over to the start of the next month)

That last requirement may not be the simplest one to code, depending on the library you’re using, but at least it’s clear and testable.
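As a sketch of how the requirement might be coded with java.time, the hypothetical class below implements the rollover rule on top of LocalDate.plusMonths, which on its own clamps to the last day of the shorter month. The class and method names are assumptions for illustration. The main method also shows why adding and subtracting months are not interchangeable.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper, for illustration only.
class ReturnsWindow {
    static final int RETURN_MONTHS = 3;

    // Adds 3 months to the shipping date. If that day does not exist in the
    // target month, rolls over to the first day of the following month.
    static LocalDate lastValidReturnDate(LocalDate shippedDate) {
        LocalDate candidate = shippedDate.plusMonths(RETURN_MONTHS); // clamps to end of month
        boolean clamped = candidate.getDayOfMonth() < shippedDate.getDayOfMonth();
        return clamped ? candidate.with(TemporalAdjusters.firstDayOfNextMonth()) : candidate;
    }

    // The option is shown while the current date at the delivery location is
    // not later than the last valid return date (the last date is inclusive).
    static boolean showReturnOption(Instant shippedAt, Instant now, ZoneId deliveryZone) {
        LocalDate shippedDate = shippedAt.atZone(deliveryZone).toLocalDate();
        LocalDate today = now.atZone(deliveryZone).toLocalDate();
        return !today.isAfter(lastValidReturnDate(shippedDate));
    }

    public static void main(String[] args) {
        LocalDate nov30 = LocalDate.of(2021, 11, 30);
        System.out.println(nov30.plusMonths(3));                // library default: 2022-02-28
        System.out.println(lastValidReturnDate(nov30));         // requirement: 2022-03-01
        // Subtracting is not the inverse of adding:
        System.out.println(nov30.plusMonths(3).minusMonths(3)); // 2021-11-28
    }
}
```

Each of the five points in the list above maps to a line or two of this code, which is a reasonable sign that the requirement is precise enough to test.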

I wouldn’t expect a product owner to come up with requirements like that on their own, unless they happen to have done date/time work like this before. Until you’re aware of the oddities of calendar arithmetic, the potential ambiguities aren’t always obvious. But that’s where the development team can probe the requirements until they’re precise enough. The process of going from a vague set of product requirements to a specific, unambiguous, testable set of requirements will vary depending on how your team is set up, but it’s important to get there in the end. It may require multiple rounds of asking questions involving awkward corner cases, or the development team may be able to suggest a more concrete version of the ambiguous requirements.

The final step before you start writing code is to make sure you’re using the right tools for the job.

Using the right libraries or packages

While it’s possible to write clear, readable code using poor date/time libraries, it’s an uphill struggle. Once you’ve got a clear set of requirements, you’re in a good position to evaluate the technologies to use to implement them.

This is a landscape that changes over time. For example, at the time of writing, the “Temporal” proposal for a new set of standard objects for working with dates and times in JavaScript is only a draft, but if and when it’s approved, that’s likely to be an option you’d want to consider for new JavaScript projects.

We’re happy to provide recommendations for Java and .NET, as these are the platforms the authors know best, and they’re both quite stable in terms of options. Of course, there’s always the possibility that something new will have become available between the time we write this text and the time you read it, but they’re at least good starting points.

On the Java platform, if you can possibly use the java.time package introduced in Java 8, you should do so. If for some reason you’re stuck on Java 6 or Java 7, the ThreeTen-Backport project (https://www.threeten.org/threetenbp/) is a good alternative. The main objective is to avoid using java.util.Date and java.util.Calendar, both of which are full of traps waiting to lure the unwary developer into writing buggy code.
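To give a flavor of those traps, here is a short sketch of two commonly cited ones in java.util.Calendar, zero-based months and mutability, next to the java.time equivalent. The class and method names are hypothetical, and this is far from a complete list.

```java
import java.time.LocalDate;
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical demo class, for illustration only.
class LegacyTraps {
    // java.util.Calendar months are zero-based: July is 6, not 7.
    static int legacyMonthNumber() {
        Calendar cal = new GregorianCalendar(2021, Calendar.JULY, 23);
        return cal.get(Calendar.MONTH);
    }

    // Calendar is mutable: add() changes the object that every holder of the reference sees.
    static int legacyMonthAfterSharedMutation() {
        Calendar shipped = new GregorianCalendar(2021, Calendar.JULY, 23);
        Calendar deadline = shipped;        // looks like a copy, but is the same object
        deadline.add(Calendar.MONTH, 3);
        return shipped.get(Calendar.MONTH); // no longer July
    }

    // java.time values are immutable, and months are numbered 1 to 12.
    static LocalDate modernDeadline() {
        LocalDate shipped = LocalDate.of(2021, 7, 23);
        LocalDate deadline = shipped.plusMonths(3); // shipped itself is unchanged
        return deadline;
    }
}
```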

On .NET, our heavily biased recommendation is to use Noda Time (https://nodatime.org). The built-in types (DateTime, DateTimeOffset, TimeZoneInfo, TimeSpan) certainly can be used effectively, but they don’t separate out the different logical concepts we looked at earlier into different types. For example, there’s no type to represent “a date” and the same type is used for both the “duration” and “time of day” concepts. This means that it’s easy to write code that looks correct, but effectively performs invalid operations on the logical data, such as adding half an hour to a date. The way that a DateTime can mean “in some unspecified time zone”, “in the system local time zone” or “in UTC” doesn’t help, either.

Beyond these specific examples though, there are more general questions you can evaluate against any given library for your platform:

If you need to handle non-Gregorian calendar systems, does the library support those calendar systems?
Does the library provide enough control of the time zone data it uses? (For example, if you need to work with IANA time zone IDs, it’s best not to choose a library that only supports Windows time zones.)
Does the library support all the concepts you’ve identified in your requirements, providing sufficient distinctions between those concepts to help your code express your intentions clearly?
Does the library provide immutable types? While immutability as a general concept has distinct pros and cons, in the context of a date/time library it’s almost always a good thing.
Do your external dependencies (databases, other libraries, network APIs and the like) already lead in the direction of a particular library? If you need to perform conversions between different representations, is that easy to do?

Wherever possible, it’s useful to try prototyping some of the date/time requirements against the candidate library, so you’ll have an idea of what your final code will feel like. This can usually be done in a small console application or unit test project isolated from any existing application code. For example, with the requirements around item returns described earlier, I’d probably write some unit tests to check the logic for whether or not to show the “return item” option. If you’re evaluating multiple libraries, you may be able to have a single set of test cases that are then implemented using different libraries. Once you’ve got working code using all the libraries, you can compare the implementations for readability.

Once you’ve documented application-wide requirements, worked with the product owner on feature-specific requirements, and chosen a good library to use, finally you can start writing your production code.

That’s all for this article.

If you want to learn more about the book, check it out on our browser-based liveBook platform here.

From Software Mistakes and Tradeoffs by Tomasz Lelek and Jon Skeet.
