The standard for the altruistic exchange of nonclinical data

This will be my final blog post of 2021, as I’ll be taking a break over the festive season. The season of goodwill is very apt for what’s on my mind this week.

Whether it’s media reports of how COVID-19 vaccines were collaboratively developed so rapidly, or the rise of consortia exchanging pharmaceutical data to accelerate drug development, it seems that data exchange is a hot topic and a potential game changer within our industry.

This week, I heard an impromptu address from Instem’s own CEO, Phil Reason, on this topic. Now I wouldn’t normally mention such things in my blog, as that’s all ‘a little bit too corporate’ but there was a real sense of altruism in the idea that we should be providing the technology to allow pharmaceutical organizations to easily exchange data and research to accelerate the development of new treatments. Our role, even our moral responsibility, is to ensure that that technology doesn’t become a barrier to such things and instead we should be striving to provide such capabilities.

As you can imagine, for someone who has dedicated the last 9 years of their career to developing and rolling out the Standard for the Exchange of Nonclinical Data, the topic really resonated with me, and my mind immediately went to thinking about the role of SEND.

I’m not claiming to have any answers here, but I do have lots of questions for the thought leaders within our industry and how they’d address this type of data sharing.

One criticism that I’ve often heard thrown at SEND is that its value is reduced because it only includes the individual raw data, and not the study conclusions. I’ve heard tales of individuals and organizations using SEND custom domains to list the study’s significant findings, but this is not the same as conclusions, and certainly the SEND custom domain structure wouldn’t seem a good mechanism for handling conclusions. So, would we need to supplement the SEND package in some way? What would that look like?

Which data are the most valuable for such altruistic exchanges? I don’t mean the idea that the Lab Results are more useful than Clinical Signs, I’m wondering about the value of nonclinical safety assessment studies, when compared to early discovery data, or later clinical trials. I mean, are these all equally valuable?

I’m also thinking about the time and effort it takes to convert data to SEND. Does that prohibit data from being exchanged, or at least make it a little unpalatable? I mean, it’s one thing to go to the time and expense of converting a study to SEND for the purposes of a regulatory submission, but it’s another when it is simply to support someone else’s research and potential breakthrough.

There are many other questions and topics buzzing around my head when thinking about the practicalities of such data exchange, but I’m inspired by the possibilities of finding new treatments and getting them to patients far more rapidly.

Festive greetings and I’ll see you again in the new year.

Till next time,


The relative speed of SEND development

As we approach the holiday season and the eventual end of another year, I’m again thinking of how fast this year has passed. I’m also remembering childhood times when Christmas seemed to take forever to arrive. Now it seems I barely catch my breath and another year flies by. So yes, time is most definitely relative. As I look out of the window at the first dusting of snow, it seems truer than ever that the rate at which time passes very much depends on your frame of reference.

The same is true of the world of SEND. I often hear the idea that SEND is moving too quickly. We barely get time to implement one change when another arrives. We are still adopting one standard when another is being forced on us. I mean, I know various organizations are still battling with the tighter control of the Microscopic Findings introduced in SEND 3.1, yet we already have SEND 3.1.1 published and now CDISC are talking about SEND 3.2 with a whole raft of new domains and concepts to adopt.

I hear that point of view and while I have some sympathy for it, for me, SEND seems to move at an almost glacial pace. Take SEND 3.1.1 as an obvious example. This minor update addresses a crucial issue found with the Pharmacokinetics (PC) domain in SEND 3.0. Back in 2017 a small, expert team assembled to address this. Very quickly the solution was defined, yet it took until 2021 just to be published and still is not referenced on the Data Standards Catalogue.

In another example, as I’ve discussed here before, I joined CDISC back in 2012 to help work on the SEND standard for Reproductive Toxicology. It will be 2023, some 11 years later, before it finally becomes required.

Nervous System data is another area of disappointment. The domain was drafted back sometime around 2016 but it didn’t make the cut for SEND 3.1. Though it has examples, Controlled Terminology and Conformance Rules, it’s not making the cut for SEND 3.2 either. It’s currently scheduled for a tentative late 2025 publication, meaning we are looking at 2029 as the earliest that Functional Observational Battery and other CNS tests are going to be required in SEND. For an effort that began prior to 2016, this seems an unbelievably slow rate of progress.

The examples cited here are just some of many. My organization invests a huge amount of volunteer hours in CDISC because we honestly believe that the development of the SEND standard really does fit with our company vision to help our customers bring life enhancing products to market faster. We currently have more than 20 individuals giving time to CDISC, many of whom are actively leading teams. So, I personally feel the frustration of the snail’s-pace progress of SEND, but, like the changing of the seasons, the pace of progress is very much dependent on your frame of reference.

Till next time,


FDA provides more detail on the Scope of SEND

This past week was the CDISC SEND Virtual Face-to-Face meeting, and yes, I still snicker like a schoolboy at calling something a ‘virtual face-to-face’, but hopefully it’s the last one and by next Spring we can be face-to-face in person. Anyways, as is usual, the highlight of the week was the FDA public meeting. As you may imagine, this time it was dominated by discussions around the recent addition to the Technical Conformance Guide (TCG), momentously entitled the “Scope of SEND”.

I previously wrote about this when the FDA mentioned it at last year’s public meeting, and again when the recent TCG dropped. Then last week, the agency took the opportunity to take it apart, explaining their thinking and adding clarity.

The public meeting started with a reference back to the binding guidance, serving as a reminder that this is not optional or subjective, and flows directly from that oh-so-important 2014 Guidance. It reinforced the message that the recent TCG update is not adding anything new; it’s simply clarifying the position that harks back to that 2014 publication.

It was this time last year that the FDA clearly stated to industry that, although the Technical Rejection Criteria references the eCTD sections, the guidance doesn’t limit SEND requirements in the same way. A year later, we get a very detailed insight.

Without going into excessive detail, the guiding principle is this:

  • If the study starts after the requirement date, and;
  • The study is needed to assess and support the safety of clinical investigations, and;
  • The study data can be represented in SEND;

Then the study requires SEND for submission.

Things like the age of the subjects, whether or not the study is GLP and the study report status were all discussed. It was clearly stated that these things do not affect whether or not the study requires SEND.
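The guiding principle above can be sketched as a simple check. This is only an illustration of the logic as I heard it described, not an official FDA algorithm: the `Study` class, its field names and the requirement date used in the example are all hypothetical, and real decisions should still be discussed with the review division where there is ambiguity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Study:
    """Hypothetical summary of a study; field names are illustrative only."""
    start_date: date
    supports_clinical_safety: bool   # needed to assess/support clinical safety
    representable_in_send: bool      # data can be rendered in SEND
    # Recorded only to show that these do NOT feed into the decision:
    is_glp: bool = True
    juvenile_subjects: bool = False
    report_final: bool = True


def requires_send(study: Study, requirement_date: date) -> bool:
    """Apply the three conditions from the guiding principle.

    Subject age, GLP status and study report status are deliberately ignored.
    """
    return (
        study.start_date > requirement_date
        and study.supports_clinical_safety
        and study.representable_in_send
    )


# Example with a placeholder requirement date (an assumption, not the real date).
example = Study(
    start_date=date(2021, 6, 1),
    supports_clinical_safety=True,
    representable_in_send=True,
    is_glp=False,
    juvenile_subjects=True,
    report_final=False,
)
needs_send = requires_send(example, requirement_date=date(2020, 1, 1))
```

Note that the non-GLP, juvenile study with a draft report in the example still comes out as requiring SEND, because none of those attributes are part of the test.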

For some sponsors, this now causes a significant issue for certain studies. They may have had studies completed by their CRO on the incorrect assumption, at the time, that SEND wouldn’t be required because the study sat in an eCTD section not covered by the Technical Rejection Criteria. So, they may have completed studies without SEND, which actually require SEND. Fortunately for them, specialist SEND organizations like Instem can take completed studies and create SEND from the study report, allowing them to be compliant even if they didn’t realize that SEND would be required when they initiated the study.

The FDA presentation also went to great lengths to talk about the value of SEND for regulatory purposes. They described its value for primary reviews, and how it also assists in secondary and tertiary reviews. They described how SEND is “…used to conduct reviews that will impact regulatory decisions”.

Actually, it’s really inspiring hearing this, at a CDISC meeting focused on driving the SEND standard forward. The general feeling in the (virtual) room was really appreciative of the FDA’s openness in this way.

Till next time,


What’s the deal with the eCTD Study Tagging File (STF), SEND and the Technical Rejection Criteria?

As you are probably aware, Instem recently acquired PDS Life Sciences and so I’ve been working very closely with my new colleague, Mike Wasko. Mike is a prominent figure within our industry, and this week as we were catching up, we started discussing the question: What is the relationship between the eCTD Study Tagging File (STF) and SEND datasets and the impact on the Technical Rejection Criteria (TRC)? As this is an area that causes some confusion and can lead to an automated Technical Rejection of the submission, I asked Mike to write up his thoughts, so they could be shared in this blog. Here’s what he had to say:

As outlined in the Technical Conformance Guide (TCG), TRC and eCTD Guidance, the study-id used in the STF must match either the STUDYID or the Sponsor’s Reference ID (SPREFID) value in the TS domain. Additionally, this study-id must be present in the define.xml file in the StudyName element. This connection is clearly understood; however, knowing what study-id the Sponsor will use in the STF is another story.

Usually, study conduct and then the subsequent SEND dataset creation, are managed by the Study Director and a SEND team. Whether the SEND team is internal to the CRO or external, the SEND team typically has little to no contact with the submission team creating the STF.

Now that the FDA’s TRC is in effect (September 15, 2021), the connection between the SEND datasets and STF has become essential and communication regarding the study-id in the STF must adapt. In an ideal setting, the team creating the SEND dataset and the associated define.xml file, will be informed of the study-id to be used in the STF.

One would think that the study identifier should be obvious, but that is not always the case. This, coupled with the fact that the STF is usually created after the SEND datasets are completed, creates a unique problem. For studies conducted by CROs, a study might have several different identifiers: the Sponsor study id, the CRO study id, a report number or even something else. In this situation, it is difficult for the SEND team to determine which identifier will be used in the STF, and remember that wrong assumptions will cause a technical rejection.

Even if a single study number is used in the report and SEND dataset package, another use case can cause a challenge. The eCTD and STF limit the type of characters that can be used due to folder structure. Special characters such as “/”, “\”, “.” and others should not be used. If the study report and protocol use such characters for the study number, how will it be represented in the STF? Was the character simply removed? Was the character replaced with an acceptable character such as “_” or “-”?
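To make the ambiguity concrete, here is a minimal sketch that lists the ways a study number might have been rendered once the problem characters are dealt with. The function name is hypothetical, it only handles the three characters named above (“/”, “\” and “.”), and the three strategies (remove, replace with “_”, replace with “-”) are simply the options raised in the questions above, not a rule from any guidance.

```python
import re


def stf_candidates(study_number, replacements=("", "_", "-")):
    """Return the possible STF renderings of a study number.

    Each candidate substitutes every "/", "\\" or "." with one replacement
    string (an empty string means the character was simply removed).
    """
    candidates = []
    for replacement in replacements:
        candidate = re.sub(r"[/\\.]", replacement, study_number)
        if candidate not in candidates:   # avoid duplicates for clean ids
            candidates.append(candidate)
    return candidates


# A made-up study number containing two of the problem characters.
options = stf_candidates("AB/123.4")
# options == ["AB1234", "AB_123_4", "AB-123-4"]
```

Even for this short made-up identifier there are three plausible answers, which is exactly why the SEND team needs to be told which one the submission team will use.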

For submissions that occur near the time of a SEND dataset generation, proactive communication from the submission team can resolve the issue. However, consider that SEND datasets can sit for years before submission, say for an NDA. Staff may have changed or maybe the agreement wasn’t well documented. Now the Sponsor will need to ensure that the STF and the STUDYID or SPREFID align. If they do not, the ts.xpt and define.xml files will have to be updated accordingly. Currently, we are not aware of any validators that do this check automatically, so Sponsors will also need to know where to look in the define.xml and ts.xpt to ensure the study-id specified in the STF matches the dataset and define.xml file. As time progresses, it seems more and more people will need to become SEND aware, as it is no longer strictly a SEND team and Study Director responsibility.
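As a rough illustration of the check described above, the comparison itself is simple once the values have been pulled out of the files. This sketch assumes the STF study-id, the STUDYID and SPREFID values from ts.xpt, and the StudyName from define.xml have already been read by whatever means you have available; the function name is hypothetical, and exact, case-sensitive matching is an assumption on my part.

```python
def stf_alignment_problems(stf_study_id, ts_studyid, ts_sprefid, define_study_name):
    """Return a list of human-readable mismatches (empty if everything aligns).

    ts_sprefid may be None when the TS domain carries no SPREFID record.
    """
    problems = []
    # The STF study-id must match STUDYID or SPREFID in the TS domain.
    if stf_study_id not in (ts_studyid, ts_sprefid):
        problems.append("STF study-id matches neither STUDYID nor SPREFID in TS")
    # The same study-id must also appear as StudyName in define.xml.
    if stf_study_id != define_study_name:
        problems.append("STF study-id does not match StudyName in define.xml")
    return problems


# Made-up identifiers: the STF uses the Sponsor's reference, which TS carries
# as SPREFID, but define.xml still holds the CRO study number.
found = stf_alignment_problems(
    stf_study_id="SPN-001",
    ts_studyid="CRO-12345",
    ts_sprefid="SPN-001",
    define_study_name="CRO-12345",
)
```

In this made-up case the TS domain is fine but define.xml would need updating, which is the kind of mismatch that is easy to miss when the datasets were created years before the submission.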

I thought that was a great explanation for Mike to share, and you can expect that I will be asking Mike for future contributions as we continue to navigate the challenges and opportunities that SEND presents.

If you’d like to continue this conversation, drop me a line at

Till next time,


Is SEND the key to unlocking Historical Control Data?

“You wait all day for a bus and then 3 come along at once.” It’s a phrase I used to hear a lot in my younger days when I would often ride public transport. There’s been some of that going on this week, though not with buses. For me, this week it’s been the role of SEND in Historical Control Data (HCD) systems.

It must have been a year or two since this topic last cropped up, but this week, completely coincidentally, it’s been raised several times by different people in completely different contexts. When that topic is raised, immediately people start considering the role of standardized data, and the possibilities SEND brings.

Any potential HCD system has three components:

  1. A large wealth of data to draw on
  2. A way of harmonizing those data, particularly when the data come from different sources utilizing different structures and terms
  3. A tool to query, aggregate and visualize the data
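To show how the second and third components fit together, here is a toy sketch that harmonizes test names from different sources and then aggregates control-animal values into a range. Everything in it is hypothetical: the record layout, the synonym map and the function name are made up for illustration, and it ignores units, species, sex and study type, all of which a real HCD system would need to consider.

```python
from statistics import mean

# Hypothetical synonym map: source-specific test names -> one harmonized code.
TERM_MAP = {
    "ALT": "ALT",
    "SGPT": "ALT",
    "ALANINE AMINOTRANSFERASE": "ALT",
}


def control_ranges(records, term_map=TERM_MAP):
    """Summarize control-group values per harmonized test.

    Each record is a dict with keys "test", "value" and "is_control".
    """
    grouped = {}
    for record in records:
        if not record["is_control"]:
            continue  # only control animals contribute to historical ranges
        name = record["test"].upper()
        test = term_map.get(name, name)  # fall back to the source name
        grouped.setdefault(test, []).append(record["value"])
    return {
        test: {"n": len(values), "min": min(values),
               "max": max(values), "mean": mean(values)}
        for test, values in grouped.items()
    }


# Made-up values from three sources that each name the same test differently.
summary = control_ranges([
    {"test": "ALT", "value": 30.0, "is_control": True},
    {"test": "sgpt", "value": 40.0, "is_control": True},
    {"test": "Alanine Aminotransferase", "value": 50.0, "is_control": True},
    {"test": "ALT", "value": 90.0, "is_control": False},
])
```

The synonym map is the part that standardized data largely removes the need for, while the aggregation step is the easy part once the data are consistent.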

As an organization that for many years was simply a software vendor, tool development and data harmonization would have sat well within our comfort zone. Yet without being able to draw on a significant volume of electronic data, there wasn’t much value in developing such tools.

The SEND standard is now opening that up as a possibility. This is because more data are becoming available since CROs are no longer just supplying the PDF study report, but also providing standardized electronic data.

Standardized data means consistent data, regardless of CRO or data collection system. That’s the idea that really opens up the possibilities for HCD systems. That final stumbling block to the value of such a system is now overcome.

As well as some organizations drawing on their own data, others are considering the possibility of pooling their control data. It’s an intriguing possibility. Some SEND tools, like Instem’s SEND Explorer, already have built-in visualizations for querying Historical Control Ranges. These would provide far more value when hooked up to such a vast database. This in turn raised the question of whether there’s a need for independent curation of the data, and maintenance of the database.

Anyways, having not thought about HCD for a while, I was asked about it in the context of our own data collection systems, then came a query about SEND Explorer’s functionality, and then the possibility of data curation. “Just like buses, three come along at once.”

Till next time,


Is the latest Technical Conformance Guide update, the most important to date?

It was late 2020, during the FDA public webinar, as part of the CDISC face-to-face meeting, that the agency made the simplest of statements, which seemed to turn the world of SEND upside down:

The placement of a study into the eCTD format does not determine the SEND requirement

It was just one little line sitting innocuously amongst many others. Online chat lit up. For years, the CDISC fraternity had proclaimed the eCTD section, together with the study start date, as the key foundations for determining, absolutely, if a study required SEND datasets or not.

On that October morning, without fanfare or forewarning, the agency let us all know that simply was not the case. What followed was a period of questions and, honestly, confusion. Then, last week, the FDA’s Technical Conformance Guide v4.8 was published. Three additional pages of text were added under a new section called “Scope of SEND”.

Well, what does it say? It opens with the line, “The following is the Agency’s current thinking of the scope of SEND…”. So, everything stated is just the ‘current thinking’, and this target is moving.

It then broadly states that SEND datasets are required for as many studies as possible, an idea probably best captured in the line, “If the nonclinical pharmacology or toxicology study is required to support a regulatory decision by the Agency… then the nonclinical study would require SEND.”

Within this sweeping statement, the document then deals with some of the specifics, particularly focusing on areas that we’ve heard questioned over the past year. What happens when one study type incorporates endpoints from another study type? The document uses the example of general toxicity studies that incorporate cardiovascular safety pharmacology or genetic toxicity. It states that the study is still in scope for SEND with the expectation that any data which could be rendered in SEND, should be rendered in SEND.

The guide states that the age of the subject doesn’t impact the decision as to whether or not SEND is required. It states that SEND is required for juvenile studies as long as the study does not “…include multiple phases [which] cannot currently be modelled…”

It further states that “The requirement for SEND is not limited to the drug substance” nor “…study report status or the finalization of the study report…”. Also, regarding the GLP status: “As both GLP and non-GLP toxicity studies may be submitted to the FDA to support clinical safety, the decision for inclusion of SEND is independent of GLP status.”

Over the course of three pages, the agency is attempting to be as clear and helpful as they can, to show that if a study can be rendered in SEND, for a study that informs the decision for the clinical safety of the drug, then those study data should be in SEND.

However, the text is littered with caveats and exceptions, and so appears both explicit and vague at the same time. For this reason, it encourages sponsors to enter into “discussion with the review division when there is any ambiguity on the SEND requirement…”. We have to appreciate that this is an effort to clarify the expectations, and though it seems to leave many questions unanswered, one thing is certain, they are encouraging the submission of SEND datasets.

So, does your study require SEND? That question just got a whole lot harder to answer, but certainly the answer is now more likely to be, “Yes.”

Till next time,


Why Define-XML files give me the Happy Mondays

During my formative years, there was a band from the north of England, not far from where I was growing up, called the Happy Mondays. Probably the most notable thing about them was one band member called ‘Bez’. His contribution was somewhere between cheerleading and performance art, as all he appeared to do was dance like a maniac while shaking maracas like his life depended on it. Nobody really knew why, yet he was a key member of the band. We couldn’t hear the maracas, so he wasn’t adding anything musically, but somehow the band would not have been the same without him.

Bez is the best metaphor I can think of to describe how most of our community views that strange little xml file that accompanies the SEND xpt files as part of the study package: the Define-XML file. It doesn’t contain data, just information about the data. We are not quite sure what it’s adding, we just know the package wouldn’t be complete without it.

I’ve heard it said that the FDA does not use the Define files. I’ve also heard it said that they do use them, and they are necessary for loading the data. I have seen FDA feedback provided to sponsors which remarks on the Define file. Still, I think there’s some confusion across the industry regarding what the FDA actually do with the Define file.

Most commercial SEND solutions produce some default Define file populated with some basic information, usually based off a generic template and often, not too study specific. The file follows the Define-XML standard, but it isn’t really adding any value. Usually, the intention is that such files are the starting point and it’s assumed that the organization will complete them manually to tailor them to the individual study. From what I have seen, some organizations are doing exactly this, and manually editing the XML. However, some organizations are not doing this and instead, simply supply the default file that has been auto generated.

The three reasons why some organizations only supply the default file are quite clear:

  • They are unsure of the value of a well-formed Define file
  • They do not have the necessary tools to produce a well-formed Define file
  • They do not have the necessary expertise

This has been the way for some time now, but I’m starting to see a change in the tide here. As mentioned earlier, the FDA often include discussion of the quality of the Define file in their sponsor feedback. Also, tools will continue to be developed and improved to accommodate this.

The Define-XML standard is something separate from the SEND standard. The standard is the same for both clinical datasets (SDTM) and nonclinical (SEND). For this reason, this week I’ve been learning a little about how Define files are both produced and used in the clinical world. It appears that our clinical colleagues have tapped into the value and purpose of the Define file, yet in nonclinical, we often still view them in a similar way to how my teenage self viewed that maraca shaking curio.

Maybe in a future post, we can discuss some of the opportunities afforded by having well-formed Define files.

Till next time,


Can SEND datasets be fully compliant…but still wrong?

In a recent post, we discussed how there’s quite a bit of emphasis at the moment on ensuring SEND datasets are compliant with the SEND Standard. Obviously, the main driver here is the activation of the FDA’s technical rejection criteria, which will result in the agency automatically rejecting applications which do not meet the required criteria. We’ve also seen CDISC’s move into automated compliance checking with the initiation of the ‘CORE’ project. In fact, the whole topic of ‘compliance’ seems to be the biggest talking point in the world of SEND right now.

It should go without saying that no matter how compliant the datasets are, they also need to be a correct representation of the study data. However, it’s clear that some organizations are putting effort into compliance but forgetting the more basic principle of making sure the data are correct. I mean, there’s no point having beautifully rendered SEND datasets, with all variables populated and formatted in accordance with the standard, if the actual values are incorrect.

For clarification, I’m referring to occasions when a result in a SEND dataset doesn’t match the value in the study report. Typical issues can be things like having result data, say bodyweights, which are being reported in SEND for dates long after the subject was terminated; or negative lab results that should never be negative. All examples of data which can be rendered in a compliant manner, but still just obviously ‘wrong’.
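The two typical issues above lend themselves to simple plausibility checks. The sketch below is only an illustration under stated assumptions: the records are plain dictionaries keyed by the usual SEND variable names (USUBJID, BWDTC, LBTESTCD, LBSTRESN), the termination dates are assumed to have been looked up already, and the function name and the list of tests that must never be negative are hypothetical. It will not tell you whether a value matches the study report; it only flags values that are obviously ‘wrong’.

```python
from datetime import date


def plausibility_issues(body_weights, termination_dates, lab_results,
                        non_negative_tests=("GLUC", "ALT")):
    """Flag body weights dated after termination and negative lab results.

    termination_dates maps USUBJID -> ISO 8601 date string (assumed input).
    non_negative_tests is an illustrative list, not a complete one.
    """
    issues = []
    for rec in body_weights:
        end = termination_dates.get(rec["USUBJID"])
        if end is None:
            continue  # no termination date known for this subject
        # Compare only the date part, so values with a time also work.
        weighed = date.fromisoformat(rec["BWDTC"][:10])
        if weighed > date.fromisoformat(end[:10]):
            issues.append(f"{rec['USUBJID']}: body weight dated after termination")
    for rec in lab_results:
        value = rec["LBSTRESN"]
        if rec["LBTESTCD"] in non_negative_tests and value is not None and value < 0:
            issues.append(f"{rec['USUBJID']}: negative {rec['LBTESTCD']} result")
    return issues


# Made-up records: one weight after termination and one negative glucose.
flagged = plausibility_issues(
    body_weights=[
        {"USUBJID": "S1-001", "BWDTC": "2021-03-01"},
        {"USUBJID": "S1-001", "BWDTC": "2021-04-15T09:30"},
    ],
    termination_dates={"S1-001": "2021-03-29"},
    lab_results=[
        {"USUBJID": "S1-002", "LBTESTCD": "GLUC", "LBSTRESN": -1.2},
        {"USUBJID": "S1-002", "LBTESTCD": "ALT", "LBSTRESN": 35.0},
    ],
)
```

Both made-up problem records would pass a purely structural conformance check, which is the point: the datasets can be compliant and still wrong.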

Some of us may struggle to believe that could really occur, but it’s surprisingly easy to see how that can happen. Some organizations will collect data, complete the study, produce the PDF tables, review them and any such errors found would be corrected in the tables directly. At some point later, the SEND datasets are produced from an export from the data collection system. At this point, any corrections in the tables are not reflected in the data collection system and therefore not in the SEND Datasets.

Other practices and processes which don’t consider SEND from the outset can result in similar issues. For this reason, I continually quote what I consider to be one of the most important, and most often overlooked, statements in the FDA’s Technical Conformance Guide (Section General Considerations): “The ideal time to implement SEND is prior to the conduct of the study as it is very important that the results presented in the accompanying study report be traceable back to the original data collected.”

So, the first issue is that without proper forethought, incorrect results can occur. The second issue then is that they may be difficult and expensive to detect.

To a large degree, automated tools can be developed to check conformance, particularly with the publication of CDISC conformance rules and FDA validation rules. However, checking that a result in a SEND dataset matches the corresponding PDF table is something that still requires a human touch.

So yes, we need to ensure conformance to the standard, but how much more important is it to ensure that results themselves are correct?

Till next time,


Did you see the recent paper from the JPMA SEND Taskforce Team?

Okay, first – some context…

Without the FDA requiring SEND datasets, we would not have seen the industry-wide adoption and implementation of the standard. The change made by the industry continues to fascinate me, in terms of both speed and scale.

This drive for submission provides us with a well-defined standard, and one that is well suited to single study review. However, it is becoming increasingly apparent that there are some shortcomings for cross study analysis and data mining. The reason for this is that, while SEND allows for accurate representation of a study’s results in electronic, machine readable form, it also allows for significant variability from study to study.

Now that I have set the scene, I’d like to discuss a recent paper by the Japan Pharmaceutical Manufacturers Association (JPMA) SEND Taskforce Team. This describes their analysis of multiple SEND packages from a variety of suppliers. It details key areas of variability in how data are represented from study to study. If data mining and cross study analysis gets you as giddy-as-a-kiddy on a Christmas morning, then I’d highly recommend that you take a deep dive into the paper for yourself. It contains very detailed results, calling out things like the specific variables that are most prone to variation between providers.

One of the key areas that the paper discusses is the scope and application of SEND Controlled Terminology (CT). It will not come as a surprise to anyone who is routinely working with SEND Datasets, that many key variables do not have CT defined for them. They allow for a free text description. The paper calls out many examples, including Clinical Signs where even variables like the test name, and the severity are not controlled.

Stepping away from Clinical Signs, the discussion on CT reminded me of work being conducted by PHUSE regarding the lack of CT for the Vehicle being used on the study. While, for single study analysis, a free text description is perfectly adequate, when it comes to data mining, the lack of CT proves problematic. For this specific issue, PHUSE are recommending a particular structure, format and nomenclature be used to describe the vehicle.

Such recommendations, to enforce supplementary rules and standardization – essentially, further CT in addition to the regular SEND CT – add complexity to the creation of SEND datasets. That complexity will then increase the time and cost to produce SEND Datasets. That discussion will open up another debate, which I’ll leave for a different day.

Suffice to say that SEND provides an accurate representation of a study’s results in electronic form, well-suited to single study review. There are shortcomings relating to multi-study usage, but these can be overcome. The JPMA paper does a very good job of calling out the issues to address.

As usual, drop me a note if you’d like to discuss this further.

Till next time,