Data Migration

From CaisisWiki

(Difference between revisions)
Jump to: navigation, search
(Step 4: Identify existing patients)
(Step 6: Migration Other Tables)
 
(15 intermediate revisions not shown)
Line 1: Line 1:
-
[[Category:DataManagement]]
+
[[Category:Caisis]]
-
[[Category:F2F2009]]
+
[[Category:Caisis Database]]
-
{|class=wikitable cellpading=0
+
-
|bgcolor=999999 width="8"| ||bgcolor=white|[[Image:Wikipuzzlepiece50px.png]]|| '''Please help [{{fullurl:{{FULLPAGENAME}}|action=edit}} expand this article.]'''<small>
+
-
|}
+
-
 
+
==Before You Start==
==Before You Start==
===Goal===
===Goal===
-
Import historical data (legacy system) into Caisis tables (SQL Server 2005) using a staging database (MS Access 2003/2007).
+
Import historical data (legacy system) into Caisis tables (SQL Server 2005) using staging tables.
*The staging database is a test environment that mirrors the destination database (in this case, Caisis).
*The staging database is a test environment that mirrors the destination database (in this case, Caisis).
*The source data can be in any file type, but we either import the data into our staging database or link to it (via linked tables).
*The source data can be in any file type, but we either import the data into our staging database or link to it (via linked tables).
Line 18: Line 14:
*Buy-in owners of the data (or head of group being migrated into Caisis).
*Buy-in owners of the data (or head of group being migrated into Caisis).
===Necessary skills===  
===Necessary skills===  
-
*SQL
+
*SQL Server Integration Services (SSIS)
*T-SQL
*T-SQL
-
*MS Access
+
 
===Necessary knowledge===  
===Necessary knowledge===  
*knowledge of terminology (i.e. disease-specific vocabulary) used in source system.
*knowledge of terminology (i.e. disease-specific vocabulary) used in source system.
Line 27: Line 23:
*knowledge of Caisis model.
*knowledge of Caisis model.
-
==Why MS Access?==
+
 
-
*Easy to use: Visual interface for building and organizing queries.
+
-
*Distribution of workload
+
-
**analyst with MS Access skills can stage the data while your DBA can maintain their day-to-day tasks and append the data to production Caisis.
+
-
**It's possible that the analyst doesn't have access to your SQL Server database.
+
-
*Maintains referential integrity between primary and foreign keys.
+
-
*Maintains structural integrity such as required fields, data types, and max characters.
+
-
*SQL Server can easily link to MS Access.
+
==Map Source Data to Caisis==
==Map Source Data to Caisis==
Line 71: Line 60:
#*must work with expert of source system
#*must work with expert of source system
#What if the source data point doesn’t fit?
#What if the source data point doesn’t fit?
-
#*make sure it is a true data point and not derived (calculated)
+
##make sure it is a true data point and not derived (calculated)
-
#*one option is to update Caisis data model (requires development skills to update application)
+
##one option is to update Caisis data model (requires development skills to update application)
-
#*alternative option: migrate points that fit, store key to legacy data and re-visit the gap(s) after data model is updated to accommodate.
+
##virtual field (more on this in a later document)
-
#*if you want to add a field, let us know – we may want to add it to the core structure.
+
##alternative option: migrate points that fit, store key to legacy data in '''RelatedRecords''' table and re-visit the gap(s) after data model is updated to accommodate.
 +
##if you want to add a field, let us know – we may want to add it to the core structure.
-
==Staging in MS Access==
+
==Staging Tables==
-
Goal: create a staging database that mirrors Caisis contraints and structure.
+
Goal: create staging tables in SQL that mirror Caisis contraints and structure.  This is important to make sure that you data is in a format suitable for Caisis to consume.
-
  Note: At this point, you need someone with DBA/SQL Server skills.  Gets technical.
+
  Note: At this point, you will need some T-SQL language knowledge.  Gets technical.
-
===Step 1: Create "Caisis shell" in MS Access===
+
===Append source data to staging tables (based on mapping)===
-
*Create “empty shell” of Caisis tables in an MS Access database via '''SQL Server Import and Export Wizard''' (DTS package wizard).
+
-
**http://msdn.microsoft.com/en-us/library/ms140052(lightweight).aspx
+
-
*Tip: the Wizard will only export to MS Access 2003 file type (Access 2007 can create a 2003 file).
+
-
 
+
-
===Step 2: Connect to source data from within MS Access Staging Database===
+
-
*Best practice: link to the data source by creating linked tables rather than importing source data into staging database.
+
-
**Why?: This helps with versioning issues (prevents branching caused by separate copies of the source data).  Any changes made to the source data will be automatically available in the staging database.
+
-
 
+
-
===Step 3: Append source data to staging tables (based on mapping)===
+
*Best practice: populate Patients table first - even if you do not plan to import data to Patients table in Caisis (i.e. patients already exist in Caisis).
*Best practice: populate Patients table first - even if you do not plan to import data to Patients table in Caisis (i.e. patients already exist in Caisis).
Line 95: Line 76:
*Don't forget required fields (i.e. EnteredBy, EnteredTime, etc.).
*Don't forget required fields (i.e. EnteredBy, EnteredTime, etc.).
*Be careful with Memo data types in MS Access.  May truncate in Caisis (i.e. max notes column is 1000).
*Be careful with Memo data types in MS Access.  May truncate in Caisis (i.e. max notes column is 1000).
-
**When SQL Server Export Wizard generates Caisis staging tables in your MS Access staging database, it estimates the best option for data type.  For instance, notes fields in Caisis (i.e. PathNotes) are varchar(1000) which translates to a Memo field in MS AccessIf you stage notes data in this MS Access field, it may exceed the 1000 character limit of the Caisis notes field.
+
*Caisis has a DateText field that goes along with each Date field.  Make sure that all DateText fields are formatted as text.
-
*DateText fields should be formatted as text. In MS Access, use the format expression: <code>Format([Date], "mm/dd/yyyy")</code>
+
==Connect SQL Server to Staging DB==
==Connect SQL Server to Staging DB==
-
*Create linked server between SQL Server and your MS Access staging database.
+
*If your source data is in MS Access, you can create linked server between SQL Server and your MS Access database.
**Provider: '''Microsoft Jet 4.0 OLE DB Provider'''
**Provider: '''Microsoft Jet 4.0 OLE DB Provider'''
-
==Staging DB To SQL Server 2005 (T-SQL Script)==
+
==Source DB To SQL Server 2005 (T-SQL Script)==
-
This section provides snippets of T-SQL that can be used to build a migration script for importing data from a staging datababse into Caisis database.
+
This section provides snippets of T-SQL that can be used to build a migration script for importing data from a source datababse into Caisis database.
<font color=red>'''Disclaimer:'''
<font color=red>'''Disclaimer:'''
Line 109: Line 89:
'''*The following code is intended to serve as a sample and should not be considered a production-ready solution.  Different systems and migrations have different requirements and your migration script should be written to suit your specific needs.'''
'''*The following code is intended to serve as a sample and should not be considered a production-ready solution.  Different systems and migrations have different requirements and your migration script should be written to suit your specific needs.'''
-
'''*These scripts should not be run during regular operating hours.  You should schedule your migration at off-peak time.'''</font>
+
'''*These scripts should not be run during regular operating hours.  You should schedule your migration at off-peak time.'''
-
 
+
-
===Step 1: Disable all trigers===
+
-
<code>
+
-
EXEC spUtilityToggleTriggers 'DISABLE', '%'
+
-
</code>
+
-
 
+
-
===Step 2: Add temporary columns to parent tables===
+
-
*These columns will store the primary keys from your staging database. 
+
-
*This helps maintain the relationships between parent/child tables that were established in your staging database. 
+
-
*This is only necassary for tables that have child tables associated with them (if you will insert data into those child tables).
+
-
 
+
-
<code>
+
-
EXEC spUtilityToggleColumn 'Patients', 'ADD', 'StagingPatientId varchar(50) null'
+
-
 
+
-
EXEC spUtilityToggleColumn 'Encounters', 'ADD', 'StagingPK int null'
+
-
</code>
+
-
 
+
-
===Step 3: Add 'MatchedPatient' column to Patients table===
+
-
*In the next step, we will insert a bit into this column that identifies if a patient already exists in the Patients table.
+
-
 
+
-
<code>
+
-
EXEC spUtilityToggleColumn 'Patients', 'ADD', 'MatchedPatient bit null'
+
-
 
+
-
GO
+
-
</code>
+
-
===Step 4: Identify existing patients===
+
'''*This script is intended to serve as a solution that does not modify production tables.  Instead, it uses temporary staging tables to enforce referential integrity between source and production tables.'''
-
*Identify patients that already exist in Caisis -- Matching records
+
-
UPDATE Patients
+
'''*This document outlines one approach. The Caisis Team is can not be responsible for any loss of data or any other occurrence while using this approach.'''
-
SET Patients.MatchedPatient = 1, Patients.SourcePatientId = b.PatientId
+
</font>
-
FROM Patients INNER JOIN StagingDB.dbo.Patients b ON Patients.PtMRN = b.PtMRN
+
-
===Step 5: Patients table===
+
===Step 1: Declare Variables===
-
Insert records in Patients table for non-matching (don't already exist) patients
+
CREATE TABLE #vars (vname varchar(50), vvalue varchar(50), ivalue int)
 +
--Set the institution Id assigned to the data set you’re migrating (in this example 10):
 +
INSERT INTO #vars (vname, ivalue) values ('InstitutionId', 10)
 +
INSERT INTO #vars (vname, vvalue) values ('EnteredBy', 'DataMigration')
-
<source lang="tsql">
+
===Step 2: Fill in destination_PatientId for existing pts===
-
INSERT INTO Patients
+
*In staging environment, store primary key PatientId of production.
-
    (PtMRN, LungPatientId, PtTitle, PtFirstName, PtMiddleName, PtLastName, PtSuffix, PtAlias, PtMaidenName,
+
-
            PtGender, PtCompany, PtBusinessCity, PtBusinessState, PtAddress1, PtAddress2, PtCity, PtState,
+
-
            PtPostalCode, PtCountry, PtEmail, PtBirthDateText, PtBirthDate, PtBirthPlace, PtDeathDateText,
+
-
            PtDeathDate, PtDeathType, PtDeathCause, PtContactPerson, PtContactAddress1, PtContactAddress2,
+
-
            PtContactCity, PtContactState, PtContactPostalCode, PtContactPhone, PtSpouse, PtRace, PtEthnicity,
+
-
            PtLanguage, PtContactStatus, PtNotes, EnteredBy, EnteredTime, UpdatedBy, UpdatedTime, LockedBy, LockedTime)
+
-
SELECT a.PtMRN, a.PatientId, a.PtTitle, a.PtFirstName, a.PtMiddleName, a.PtLastName, a.PtSuffix, a.PtAlias,
+
  -- Fill in Destination_PatientId for existing pts.
-
a.PtMaidenName,a.PtGender, a.PtCompany, a.PtBusinessCity, a.PtBusinessState, a.PtAddress1, a.PtAddress2,
+
-- Destination_PatientId is the PatientId (primary key) in the destination Patients table.
-
a.PtCity, a.PtState,a.PtPostalCode, a.PtCountry, a.PtEmail, a.PtBirthDateText, a.PtBirthDate, a.PtBirthPlace,
+
-- This will provide a mapping between the source Patients table record and the destination Patients table record.
-
a.PtDeathDateText,a.PtDeathDate, a.PtDeathType, a.PtDeathCause, a.PtContactPerson, a.PtContactAddress1,
+
UPDATE Staging.dbo.Patients
-
a.PtContactAddress2,a.PtContactCity, a.PtContactState, a.PtContactPostalCode, a.PtContactPhone, a.PtSpouse,
+
SET Destination_PatientId = b.PatientId
-
a.PtRace, a.PtEthnicity,a.PtLanguage, a.PtContactStatus, a.PtNotes, a.EnteredBy, a.EnteredTime,
+
FROM Staging.dbo.Patients a INNER JOIN Production.dbo.Patients b ON a.PtMRN = b.PtMRN
-
        a.UpdatedBy, a.UpdatedTime, a.LockedBy, a.LockedTime
+
-
FROM StagingDB.dbo.Patients a LEFT OUTER JOIN Patients ON a.PtMRN = Patients.PtMRN  
+
-
WHERE Patients.PtMRN IS NULL
+
-
</source>
+
-
===Step 6: PatientInstitutions===
+
===Step 3: Add new patients===
-
*Set the institution Id assigned to the data set you’re migrating
+
INSERT INTO Production.dbo.Patients (PtMRN, EnteredBy, EnteredTime)
-
*Note: if this is a new institution (Caisis data set), you must first add it to the Institutions table. Use the InsitutionId value that is generated in the Institutions table as your <code>@InstitutionId</code> parameter below.
+
SELECT PtMRN, EnteredBy, GETDATE()
 +
  FROM Staging.dbo.Patients
 +
WHERE Destination_PatientId IS NULL
-
<source lang="tsql">
+
===Step 4: Fill in Destination_PatientId for new pts===
-
DECLARE @InstitutionId int
+
-
SET @InstitutionId = 1
+
-
INSERT INTO PatientInstitutions
+
/*
-
(PatientId, InstitutionId, EnteredTime, UpdatedTime )
+
Since MRN is unique per patient and also unique per record in the Patients table,
-
SELECT PatientId,@InstitutionId,GETDATE(), GETDATE()
+
it can therefore be used as natural join between source Patients table
-
FROM Patients  
+
and destination Patients table.
-
WHERE Patients.StagingPatientId IS NOT NULL AND
+
*/
-
        PatientId NOT IN (SELECT PatientId  
+
UPDATE Staging.dbo.Patients
-
                          FROM PatientInstitutions
+
SET Destination_PatientId = b.PatientId
-
                          WHERE InstitutionId = @InstitutionId)
+
FROM Staging.dbo.Patients a INNER JOIN Production.dbo.Patients b ON a.PtMRN = b.PtMRN
-
</source>
+
WHERE a.Destination_PatientId IS NULL
-
===Step 7: Migrate other tables===
+
===Step 5: PatientInstitutions===
-
The example below demonstrates how to join associated parent/child tables using the temporary '''StagingPK''' column.
+
*Set the institution Id assigned to the data set you’re migrating
 +
*Note: if this is a new institution (Caisis data set), you must first add it to the Institutions table.
-
<source lang="tsql">
+
INSERT INTO Production.dbo.PatientInstitutions (PatientId, InstitutionId, EnteredTime, UpdatedTime )
-
--ENCOUNTER--------------------
+
SELECT Destination_PatientId, (SELECT ivalue FROM #vars WHERE vname = 'InstitutionId'),
 +
      EnteredTime, EnteredTime
 +
FROM Staging.dbo.Patients
 +
WHERE Destination_PatientId NOT IN (SELECT PatientId
 +
                                    FROM Production.dbo.PatientInstitutions
 +
                                    WHERE InstitutionId = (SELECT ivalue FROM #vars
 +
                                                          WHERE vname = 'InstitutionId'))
-
INSERT INTO Encounters
+
===Step 6: Migration Other Tables===
-
(PatientId, LungPK, EncPending, EncDateText, EncDate, EncType, EncChiefComplaint, EncPhysician,  
+
-------FAMILY MEMBERS------------------------------------------------------------------
-
EncFellow, EncNurse, EncPresentation, EncStatus, EncBP, EncPulse, EncTemp, EncRespiration, EncHeight,
+
-- Add new FamilyMembers records based on this specific sorting.
-
EncWeight, EncBSA, EncBMI, EncKPS, EncECOG_Score, EncZubrod_Score, EncInstitution, EncNotes, EncDataSource,
+
INSERT INTO Production.dbo.FamilyMembers
-
EncQuality, EnteredBy, EnteredTime, UpdatedBy, UpdatedTime, LockedBy, LockedTime )
+
  (PatientId, FamMemNum, FamMemSide, FamMemRelation, FamMemDeathAge,  
-
SELECT b.PatientId, a.EncounterId, a.EncPending, a.EncDateText, a.EncDate, a.EncType, a.EncChiefComplaint,
+
  EnteredBy, EnteredTime, UpdatedBy, UpdatedTime, LockedBy, LockedTime)
-
a.EncPhysician, a.EncFellow, a.EncNurse, a.EncPresentation, a.EncStatus, a.EncBP, a.EncPulse, a.EncTemp,
+
SELECT b.Destination_PatientId, FamMemNum, FamMemSide, FamMemRelation, FamMemDeathAge,  
-
a.EncRespiration, a.EncHeight, a.EncWeight, a.EncBSA, a.EncBMI, a.EncKPS, a.EncInstitution, a.EncECOG_Score,
+
      a.EnteredBy, a.EnteredTime, a.UpdatedBy, a.UpdatedTime, a.LockedBy, a.LockedTime
-
a.EncZubrod_Score, a.EncNotes, a.EncDataSource, a.EncQuality,a.EnteredBy, a.EnteredTime, a.UpdatedBy, a.UpdatedTime,  
+
FROM Staging.dbo.FamilyMembers a INNER JOIN Staging.dbo.Patients b ON a.PatientId = b.PatientId
-
a.LockedBy, a.LockedTime  
+
-- NOTE: It is important ORDER BY the production table's identity column (in this case FamilyMemberId)
-
FROM StagingDB.dbo.Encounters a INNER JOIN Patients b ON a.PatientId = b.StagingPatientId
+
-- so that the records sort sequentially.
 +
ORDER BY a.PatientId ASC, a.FamilyMemberId ASC
-
--EncExams---------------------
+
-- Fill in Dest_FamilyMemberId for new records added
 +
UPDATE a
 +
SET Destination_FamilyMemberId = dest.FamilyMemberId
 +
FROM Staging.dbo.FamilyMembers a
-
INSERT INTO EncExams
+
INNER JOIN
-
    EncounterID, ExamSystem, Exam, ExamResult, ExamAttendingPerformed, ExamNotes,
+
(
-
    EnteredBy, EnteredTime, UpdatedBy, UpdatedTime)
+
-- NOTE: Use the same fields in ROW_NUMBER that you used in ORDER BY.
-
SELECT     b.EncounterID, a.ExamSystem, a.Exam, a.ExamResult, a.ExamAttendingPerformed, a.ExamNotes,
+
-- The combination of these fields (in this example a.PatientId and a.FamilyMemberId) should be unique.
-
    a.EnteredBy, a.EnteredTime, a.UpdatedBy, a.UpdatedTime
+
-- We are doing this to generate matching SequenceNum in both Staging and Production.
-
FROM     StagingDB.dbo.EncExams a INNER JOIN Encounters b ON a.EncounterId = b.StagingPK
+
-- They will be joined in the next statement.
-
</source>
+
SELECT b.Destination_PatientId, a.PatientId, a.FamilyMemberId,
 +
      ROW_NUMBER() OVER (ORDER BY a.PatientId ASC, a.FamilyMemberId ASC) SequenceNum
 +
FROM Staging.dbo.FamilyMembers a INNER JOIN Staging.dbo.Patients b ON a.PatientId = b.PatientId
 +
) src ON a.PatientId = src.PatientId AND a.FamilyMemberId = src.FamilyMemberId
-
===Step 8: Drop temporary columns===
+
-- NOTE: SequenceNum in Staging will be joined to SequenceNum in Production:
-
<source lang="tsql">
+
-- NOTE: In the (ORDER BY FamilyMemberId ASC) below, make sure to use the table's primary key.
-
EXEC spUtilityToggleColumn 'Patients', 'DROP', 'StagingPatientId'
+
INNER JOIN
-
EXEC spUtilityToggleColumn 'Encounters', 'DROP', 'StagingPK'
+
(
-
</source>
+
SELECT PatientId, FamilyMemberId, ROW_NUMBER() OVER (ORDER BY FamilyMemberId ASC) SequenceNum
 +
FROM Production.dbo.FamilyMembers
 +
WHERE EnteredBy = (SELECT vvalue FROM #vars WHERE vname = 'EnteredBy')
 +
) dest ON src.SequenceNum = dest.SequenceNum AND src.Destination_PatientId = dest.PatientId
-
===Step 9: Enable triggers===
+
-------FAMILY MEMBERS DX---------------------------------------------------------------
-
<source lang="tsql">
+
-- Add new FamilyMembersDx records
-
EXEC spUtilityToggleTriggers 'ENABLE','%'
+
INSERT INTO Production.dbo.FamilyMemberDiagnosis
-
</source>
+
  (FamilyMemberId, FamMemDiagnosis, FamMemDiagnosisAge, FamMemNotes,  
 +
  EnteredBy, EnteredTime, UpdatedBy, UpdatedTime)
 +
SELECT b.Dest_FamilyMemberId, FamMemDiagnosis, FamMemDiagnosisAge, FamMemNotes,
 +
      a.EnteredBy, a.EnteredTime, a.UpdatedBy, a.UpdatedTime
 +
FROM Staging.dbo.FamilyMemberDiagnosis a INNER JOIN Staging.dbo.FamilyMembers b ON
 +
                                                    a.FamilyMemberId = b.FamilyMemberId
-
===Step 10: Test the migration===
+
===Step 7: Test the migration===
*Structure of Caisis may change during time spent on migration.
*Structure of Caisis may change during time spent on migration.
*Source system (if active during migration) may have changed, requiring update of script.
*Source system (if active during migration) may have changed, requiring update of script.
Line 235: Line 203:
*Map source data to Caisis (spreadsheet) [10% of effort]
*Map source data to Caisis (spreadsheet) [10% of effort]
*Staging database [50% effort]:
*Staging database [50% effort]:
-
**Step 1: Use SQL Server Import and Export Wizard to create staging database in MS Access.
+
**Step 1: Create staging tables in SQL.
-
**Step 2: Link to source data from within staging database.
+
**Step 2: Link to source data.
**Step 3: Write append queries to insert source data into Caisis staging tables.
**Step 3: Write append queries to insert source data into Caisis staging tables.
-
*Connect SQL Server to staging database (via Linked Server). [<1% of effort]
+
*OPTIONAL: Connect SQL Server to staging database (via Linked Server). [<1% of effort]
*Write T-SQL script to insert data into Caisis tables and test migration. [20% of effort]
*Write T-SQL script to insert data into Caisis tables and test migration. [20% of effort]
*Misc. unforeseen issues [10% of effort]: waiting for access to source data, gathering necessary knowledge, etc.
*Misc. unforeseen issues [10% of effort]: waiting for access to source data, gathering necessary knowledge, etc.

Current revision as of 21:26, 25 August 2011


Contents

Before You Start

Goal

Import historical data (legacy system) into Caisis tables (SQL Server 2005) using staging tables.

  • The staging database is a test environment that mirrors the destination database (in this case, Caisis).
  • The source data can be in any file type, but we either import the data into our staging database or link to it (via linked tables).

Assumptions

  • These migrations are for a 1-time dump of data.
  • The legacy system should be retired after migration.
  • You have access to the source data.
  • Buy-in owners of the data (or head of group being migrated into Caisis).

Necessary skills

  • SQL Server Integration Services (SSIS)
  • T-SQL

Necessary knowledge

  • knowledge of terminology (i.e. disease-specific vocabulary) used in source system.
    • Ensures that the source vocabulary matches what you have been using in Caisis.
  • knowledge of source system model.
  • knowledge of Caisis model.


Map Source Data to Caisis

  • Field-by-field mapping of source data to specific Caisis table(s) and field(s).
Sample mapping
Source Field CommentsCaisisNotes
IDPrimary key DO NOT IMPORT
Med_rec_no Patients.PtMRN
Surgery_date OperatingRoomDetails.OpDate, Procedures.ProcDate
DOBPatients.PtBirthDate
AgeCALCULATED – DO NOT IMPORT
Patient_sex Patients.PtGender
Patient_last Patients.PtLastName
Patient_first Patients.PtFirstName
Resp_MD Procedures.ProcSurgeon ProcSurgeonType=Attending
Best practice: include owner(s) of data and/or stakeholders in this process.

Tips for Mapping Your Data

  1. How do you know where to put the data in Caisis?
    • you must understand Caisis structure
    • must work with expert of source system
  2. What if the source data point doesn’t fit?
    1. make sure it is a true data point and not derived (calculated)
    2. one option is to update Caisis data model (requires development skills to update application)
    3. virtual field (more on this in a later document)
    4. alternative option: migrate points that fit, store key to legacy data in RelatedRecords table and re-visit the gap(s) after data model is updated to accommodate.
    5. if you want to add a field, let us know – we may want to add it to the core structure.

Staging Tables

Goal: create staging tables in SQL that mirror Caisis contraints and structure. This is important to make sure that you data is in a format suitable for Caisis to consume.

Note: At this point, you will need some T-SQL language knowledge.  Gets technical.

Append source data to staging tables (based on mapping)

  • Best practice: populate Patients table first - even if you do not plan to import data to Patients table in Caisis (i.e. patients already exist in Caisis).

Tips for Staging Your Data

  • Don't forget required fields (i.e. EnteredBy, EnteredTime, etc.).
  • Be careful with Memo data types in MS Access. May truncate in Caisis (i.e. max notes column is 1000).
  • Caisis has a DateText field that goes along with each Date field. Make sure that all DateText fields are formatted as text.

Connect SQL Server to Staging DB

  • If your source data is in MS Access, you can create linked server between SQL Server and your MS Access database.
    • Provider: Microsoft Jet 4.0 OLE DB Provider

Source DB To SQL Server 2005 (T-SQL Script)

This section provides snippets of T-SQL that can be used to build a migration script for importing data from a source datababse into Caisis database.

Disclaimer:

*The following code is intended to serve as a sample and should not be considered a production-ready solution. Different systems and migrations have different requirements and your migration script should be written to suit your specific needs.

*These scripts should not be run during regular operating hours. You should schedule your migration at off-peak time.

*This script is intended to serve as a solution that does not modify production tables. Instead, it uses temporary staging tables to enforce referential integrity between source and production tables.

*This document outlines one approach. The Caisis Team is can not be responsible for any loss of data or any other occurrence while using this approach.

Step 1: Declare Variables

CREATE TABLE #vars (vname varchar(50), vvalue varchar(50), ivalue int)
--Set the institution Id assigned to the data set you’re migrating (in this example 10):
INSERT INTO #vars (vname, ivalue) values ('InstitutionId', 10)
INSERT INTO #vars (vname, vvalue) values ('EnteredBy', 'DataMigration')

Step 2: Fill in destination_PatientId for existing pts

  • In staging environment, store primary key PatientId of production.
-- Fill in Destination_PatientId for existing pts.
-- Destination_PatientId is the PatientId (primary key) in the destination Patients table.
-- This will provide a mapping between the source Patients table record and the destination Patients table record.
UPDATE Staging.dbo.Patients
SET Destination_PatientId = b.PatientId
FROM Staging.dbo.Patients a INNER JOIN Production.dbo.Patients b ON a.PtMRN = b.PtMRN

Step 3: Add new patients

INSERT INTO Production.dbo.Patients (PtMRN, EnteredBy, EnteredTime)
SELECT PtMRN, EnteredBy, GETDATE()
FROM Staging.dbo.Patients
WHERE Destination_PatientId IS NULL

Step 4: Fill in Destination_PatientId for new pts

/*
Since MRN is unique per patient and also unique per record in the Patients table,
it can therefore be used as natural join between source Patients table 
and destination Patients table.
*/
UPDATE Staging.dbo.Patients
SET Destination_PatientId = b.PatientId
FROM Staging.dbo.Patients a INNER JOIN Production.dbo.Patients b ON a.PtMRN = b.PtMRN
WHERE a.Destination_PatientId IS NULL

Step 5: PatientInstitutions

*Set the institution Id assigned to the data set you’re migrating
*Note: if this is a new institution (Caisis data set), you must first add it to the Institutions table.
INSERT INTO Production.dbo.PatientInstitutions (PatientId, InstitutionId, EnteredTime, UpdatedTime )
SELECT Destination_PatientId, (SELECT ivalue FROM #vars WHERE vname = 'InstitutionId'), 
      EnteredTime, EnteredTime
FROM Staging.dbo.Patients
WHERE Destination_PatientId NOT IN (SELECT PatientId
                                   FROM Production.dbo.PatientInstitutions
                                   WHERE InstitutionId = (SELECT ivalue FROM #vars 
                                                         WHERE vname = 'InstitutionId'))

Step 6: Migration Other Tables

-------FAMILY MEMBERS------------------------------------------------------------------
-- Add new FamilyMembers records based on this specific sorting. 
INSERT INTO Production.dbo.FamilyMembers
 (PatientId, FamMemNum, FamMemSide, FamMemRelation, FamMemDeathAge, 
  EnteredBy, EnteredTime, UpdatedBy, UpdatedTime, LockedBy, LockedTime)
SELECT b.Destination_PatientId, FamMemNum, FamMemSide, FamMemRelation, FamMemDeathAge, 
      a.EnteredBy, a.EnteredTime, a.UpdatedBy, a.UpdatedTime, a.LockedBy, a.LockedTime
FROM Staging.dbo.FamilyMembers a INNER JOIN Staging.dbo.Patients b ON a.PatientId = b.PatientId
-- NOTE: It is important ORDER BY the production table's identity column (in this case FamilyMemberId)
-- so that the records sort sequentially.
ORDER BY a.PatientId ASC, a.FamilyMemberId ASC
-- Fill in Dest_FamilyMemberId for new records added
UPDATE a
SET Destination_FamilyMemberId = dest.FamilyMemberId
FROM Staging.dbo.FamilyMembers a 
INNER JOIN 
(
-- NOTE: Use the same fields in ROW_NUMBER that you used in ORDER BY.  
-- The combination of these fields (in this example a.PatientId and a.FamilyMemberId) should be unique.
-- We are doing this to generate matching SequenceNum in both Staging and Production.  
-- They will be joined in the next statement.
SELECT b.Destination_PatientId, a.PatientId, a.FamilyMemberId,
      ROW_NUMBER() OVER (ORDER BY a.PatientId ASC, a.FamilyMemberId ASC) SequenceNum
FROM Staging.dbo.FamilyMembers a INNER JOIN Staging.dbo.Patients b ON a.PatientId = b.PatientId
) src ON a.PatientId = src.PatientId AND a.FamilyMemberId = src.FamilyMemberId
-- NOTE: SequenceNum in Staging will be joined to SequenceNum in Production:
-- NOTE: In the (ORDER BY FamilyMemberId ASC) below, make sure to use the table's primary key.
INNER JOIN
(
SELECT PatientId, FamilyMemberId, ROW_NUMBER() OVER (ORDER BY FamilyMemberId ASC) SequenceNum
FROM Production.dbo.FamilyMembers
WHERE EnteredBy = (SELECT vvalue FROM #vars WHERE vname = 'EnteredBy')
) dest ON src.SequenceNum = dest.SequenceNum AND src.Destination_PatientId = dest.PatientId
-------FAMILY MEMBERS DX---------------------------------------------------------------
-- Add new FamilyMembersDx records
INSERT INTO Production.dbo.FamilyMemberDiagnosis
 (FamilyMemberId, FamMemDiagnosis, FamMemDiagnosisAge, FamMemNotes, 
  EnteredBy, EnteredTime, UpdatedBy, UpdatedTime)
SELECT b.Dest_FamilyMemberId, FamMemDiagnosis, FamMemDiagnosisAge, FamMemNotes, 
      a.EnteredBy, a.EnteredTime, a.UpdatedBy, a.UpdatedTime
FROM Staging.dbo.FamilyMemberDiagnosis a INNER JOIN Staging.dbo.FamilyMembers b ON 
                                                   a.FamilyMemberId = b.FamilyMemberId

Step 7: Test the migration

  • Structure of Caisis may change during time spent on migration.
  • Source system (if active during migration) may have changed, requiring update of script.
  • Check to make sure data is in the right place after you run the script.
    • if possible, view data in UI using a test instance of Caisis.

Percent Effort Estimates

Tip: Overestimate your timeline when creating a roadmap for a migration as it is difficult to truly assess the work involved in a migration until you are at the staging database step.

  • Strategy and planning [10% of effort]
  • Map source data to Caisis (spreadsheet) [10% of effort]
  • Staging database [50% effort]:
    • Step 1: Create staging tables in SQL.
    • Step 2: Link to source data.
    • Step 3: Write append queries to insert source data into Caisis staging tables.
  • OPTIONAL: Connect SQL Server to staging database (via Linked Server). [<1% of effort]
  • Write T-SQL script to insert data into Caisis tables and test migration. [20% of effort]
  • Misc. unforeseen issues [10% of effort]: waiting for access to source data, gathering necessary knowledge, etc.
Personal tools