For this week I am going to go through a process that I developed and am pretty proud of.
We the DBA Team are responsible for getting data in to our Integration environments and higher. We do not trust the developers with this responsibility as they would probably break something.
Issue
Recently one of our development teams has increased the request of importing an Excel file with 20 sheets in to 20 tables in a database from about once a quarter to multiple times a week and this past Monday was three times in one day. I have been the lucky DBA to get these requests as of late and after Monday I was determined to fix the process. The current procedure is to use the good ol’ Import/Export Wizard since this was a rare request. (This included a lot of point and click and possibility for manual error) With increased requests and increased table counts I knew there had to be a better way to get this accomplished without grimacing each time I see the request.
Solution
So with my increased interest in PowerShell and trying to learn and merge PowerShell and SQL I thought first how can I do this with PowerShell? So I took to the #PowerShellHelp channel on Slack. Needless to say the first couple suggestions were that sounds great for SSIS. Well, I did not ask in the #SSISHelp channel is my thought in my head. Then a suggestion to look at the ImportExcel module in the PSGallery. Then my buddy Constantine Kokkinos (b | t) endorsed the suggestion and gave me encouragement. So off I went to build a solution to import my Excel sheets in to a SQL database.
First I installed the ImportExcel module from the PowerShell Gallery. Then I did an Import-Module so I could take a look at the functions and help. Needless to say the function I was using did not have a lot of help so I ended up looking at the PowerShell code a little to better understand what the function could do.
As I started putting the code together I got more and more excited. My wife was pre-occupied with something and my son was with his nanny, so I took the free time to just flow with it.
As I was going through and testing I think I found a bug in the ConvertFrom-ExcelToSQLInsert in that it put single quotes around the columns of the insert statements that were built. After getting failures in my testing I was able to figure that out. Luckily I have the source code and can easily fix that and that is what I did. Now all my imports would work and I did not have to have use the replace function to fix it.
What I had when I was done was the working crewed code you see below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 |
#Install-Module -Name ImportExcel -RequiredVersion 4.0.8 Import-Module ImportExcel -Force #$AffordabilityLine = @() $DatabaseName = 'RiseACQDecisioning' Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -InputFile C:\1-Testing\Risk_LineFile_Load\RSE_LinesListDrop.sql Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -InputFile C:\1-TESTING\Risk_LineFile_Load\RSE_CreateLine.sql $LineFilePath = "C:\1-TESTING\Risk_LineFile_Load\LinesFile.xlsx" ## AffordabilityLine Table Import $AffordabilityLine = ConvertFrom-ExcelToSQLInsert -TableName AffordabilityLine_NEW -Path $LineFilePath -WorkSheetname "AffordabilityLine" -HeaderRow 1 #$InsertAffordabilityLine = $AffordabilityLine.Replace("'ID', 'StateId', 'Line', 'ApprovedLine'","ID, StateId, Line, ApprovedLine") $AffordabilityLine.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertAffordabilityLine -join "`n") ## APRByLine Table Import $APRByLine = ConvertFrom-ExcelToSQLInsert -TableName APRByLine_NEW -Path $LineFilePath -WorkSheetname "APRByLine" -HeaderRow 1 #$InsertAPRByLine = $APRByLine.Replace("'StateId', 'Line', 'APR', 'Channel'","StateId, Line, APR, Channel") $APRByLine.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertAPRByLine -join "`n") ## LineAssignment Table Import $LineAssignment = ConvertFrom-ExcelToSQLInsert -TableName LineAssignment_NEW -Path $LineFilePath -WorkSheetname "LineAssignment" -HeaderRow 1 #$InsertLineAssignment = $LineAssignment.Replace("'LineAssignmentId', 'MinIncomeAmount', 'MaxApprovedLineAssignment', 'ApprovedLineAssignment', 'StateCode', 'MaxIncomeAmount'","LineAssignmentId, MinIncomeAmount, MaxApprovedLineAssignment, ApprovedLineAssignment, StateCode, MaxIncomeAmount") $LineAssignment.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ("SET IDENTITY_INSERT LineAssignment_NEW ON;" + $InsertLineAssignment -join "`n" + "SET IDENTITY_INSERT LineAssignment_NEW OFF") ## Lines Table Import $Lines = ConvertFrom-ExcelToSQLInsert -TableName Lines_NEW -Path $LineFilePath -WorkSheetname "Lines" -HeaderRow 1 #$InsertLines = $Lines.Replace("'LineId', 'LineStrategyCode', 'RiskScore', 'LineAmount', 'APR'","LineId, LineStrategyCode, RiskScore, LineAmount, APR") $Lines.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ("SET IDENTITY_INSERT Lines_NEW ON;" + $InsertLines -join "`n" + "SET IDENTITY_INSERT Lines_NEW OFF") ## LineStrategy Table Import $LineStrategy = ConvertFrom-ExcelToSQLInsert -TableName LineStrategy_NEW -Path $LineFilePath -WorkSheetname "LineStrategy" -HeaderRow 1 #$InsertLineStrategy = $LineStrategy.Replace("'LineStrategyId', 'LineStrategyCode', 'Segment', 'CampaignType', 'StateId', 'Weight'","LineStrategyId, LineStrategyCode, Segment, CampaignType, StateId, Weight") $LineStrategy.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ("SET IDENTITY_INSERT LineStrategy_NEW ON;" + $InsertLineStrategy -join "`n" + "SET IDENTITY_INSERT LineStrategy_NEW OFF") ## NonDMPricingTestPTI Table Import $NonDMPricingTestPTI = ConvertFrom-ExcelToSQLInsert -TableName NonDMPricingTestPTI_NEW -Path $LineFilePath -WorkSheetname "NonDMPricingTestPTI" -HeaderRow 1 #$InsertNonDMPricingTestPTI = $NonDMPricingTestPTI.Replace("'LineStrategyId', 'StateCode', 'MaxApprovedLineAssignment', 'APR', 'MinIncomeAmount', 'MaxIncomeAmount', 'ApprovedLineAssignment'","LineStrategyId, StateCode, MaxApprovedLineAssignment, APR, MinIncomeAmount, MaxIncomeAmount, ApprovedLineAssignment") $NonDMPricingTestPTI.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertNonDMPricingTestPTI -join "`n") ## PTILineAssignment Table Import $PTILineAssignment = ConvertFrom-ExcelToSQLInsert -TableName PTILineAssignment_NEW -Path $LineFilePath -WorkSheetname "PTILineAssignment" -HeaderRow 1 #$InsertPTILineAssignment = $PTILineAssignment.Replace("'PTILineAssignmentId', 'PTILineAssignmentAmount', 'StateCode', 'APR', 'MinTerm', 'DefaultTerm', 'MaxTerm', 'ChannelCode'","PTILineAssignmentId, PTILineAssignmentAmount, StateCode, APR, MinTerm, DefaultTerm, MaxTerm, ChannelCode") $PTILineAssignment.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ("SET IDENTITY_INSERT PTILineAssignment_NEW ON;" + $InsertPTILineAssignment -join "`n" + "SET IDENTITY_INSERT PTILineAssignment_NEW OFF") ## PTILinesTest Table Import $PTILinesTest = ConvertFrom-ExcelToSQLInsert -TableName PTILinesTest_NEW -Path $LineFilePath -WorkSheetname "PTILinesTest" -HeaderRow 1 #$InsertPTILinesTest = $PTILinesTest.Replace("'StateCode', 'archetype', 'MinIncomeAmount', 'MaxIncomeAmount', 'LineAssignmentAmount', 'MinRiskScore', 'MaxRiskScore', 'Test'","StateCode, archetype, MinIncomeAmount, MaxIncomeAmount, LineAssignmentAmount, MinRiskScore, MaxRiskScore, Test") $PTILinesTest.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertPTILinesTest -join "`n") ## ServiceLine Table Import $ServiceLine = ConvertFrom-ExcelToSQLInsert -TableName ServiceLine_NEW -Path $LineFilePath -WorkSheetname "ServiceLine" -HeaderRow 1 #$InsertServiceLine = $ServiceLine.Replace("'Id', 'StateId', 'LineAmount'","Id, StateId, LineAmount") $ServiceLine.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertServiceLine -join "`n") ## StateLine Table Import $StateLine = ConvertFrom-ExcelToSQLInsert -TableName StateLine_NEW -Path $LineFilePath -WorkSheetname "StateLine" -HeaderRow 1 #$InsertStateLine = $StateLine.Replace("'StateId', 'Line'","StateId, Line") $StateLine.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertStateLine -join "`n") ## StateMaxAPR Table Import $StateMaxAPR = ConvertFrom-ExcelToSQLInsert -TableName StateMaxAPR_NEW -Path $LineFilePath -WorkSheetname "StateMaxAPR" -HeaderRow 1 #$InsertStateMaxAPR = $StateMaxAPR.Replace("'Id', 'State', 'Lines', 'MaxAPR'","Id, State, Lines, MaxAPR") $StateMaxAPR.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertStateMaxAPR -join "`n") ## StateMaxLines Table Import $StateMaxLines = ConvertFrom-ExcelToSQLInsert -TableName StateMaxLines_NEW -Path $LineFilePath -WorkSheetname "StateMaxLines" -HeaderRow 1 #$InsertStateMaxLines = $StateMaxLines.Replace("'Id', 'State', 'MaxLines', 'ProductId'","Id, State, MaxLines, ProductId") $StateMaxLines.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ($InsertStateMaxLines -join "`n") ## YodleeLines Table Import $YodleeLines = ConvertFrom-ExcelToSQLInsert -TableName YodleeLines_NEW -Path $LineFilePath -WorkSheetname "YodleeLines" -HeaderRow 1 #$InsertYodleeLines = $YodleeLines.Replace("'YodleeLineId', 'StateCode', 'LineAmount', 'MinFraudScore', 'MaxFraudScore', 'MinChargeOffScore', 'MaxChargeOffScore', 'MinCBBScore', 'MaxCBBScore'","YodleeLineId, StateCode, LineAmount, MinFraudScore, MaxFraudScore, MinChargeOffScore, MaxChargeOffScore, MinCBBScore, MaxCBBScore") $YodleeLines.Count $Result = Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -Query ("SET IDENTITY_INSERT YodleeLines_NEW ON;" + $InsertYodleeLines -join "`n" + "SET IDENTITY_INSERT YodleeLines_NEW OFF") |
So as you can see, very redundant code blocks. The next day Friday, I was determined to make the code follow better practices.
So what I did first was figure out if the ImportExcel module had a function that would read all the Sheets inside of my Workbook. Sure enough there was a Get-ExcelSheetInfo that I was able to read the Sheet names in the Workbook and store them in a variable. Now I can look through Sheets and perform my actions.
Now since I had the code that needed to be used with the ConvertFrom-ExcelToSQLInsert and had all the values that changed for each table I built my foreach loop. Then I hit an issue where some of my tables had an identity column that needed to be accounted for. So after thinking about it I made a variable with the SET IDENTITY_INSERT value for the tables that needed it and a different variable for ones that did not.
Then I added output so that I could see the record count loaded in to each table to send to the development team for validation.
See the improved code below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
#Install-Module -Name ImportExcel -RequiredVersion 4.0.8 Import-Module ImportExcel -Force $result_list = @() $DatabaseName = 'RiseACQDecisioning' $ServerName = 'localhost\sql2016' Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -InputFile C:\1-Testing\Risk_LineFile_Load\RSE_LinesListDrop.sql Invoke-Sqlcmd -ServerInstance localhost\sql2016 -Database $DatabaseName -InputFile C:\1-TESTING\Risk_LineFile_Load\RSE_CreateLine.sql $LineFilePath = "C:\1-TESTING\Risk_LineFile_Load\LinesFile.xlsx" ## Get Sheet Information $WorkbookInfo = Get-ExcelSheetInfo -Path $LineFilePath | Where-Object Hidden -EQ 'visible' | Sort-Object Name ## Loop through the Sheets foreach ($Sheet in $WorkbookInfo) { $InsertRecords = ConvertFrom-ExcelToSQLInsert -TableName "$($Sheet.Name)_NEW" -Path $LineFilePath -WorkSheetname "$($Sheet.Name)" -HeaderRow 1 #$OutputResults += "WorkSheet: $($Sheet.Name) | Count: $($InsertRecords.Count)" If ($Sheet.Name -in 'LineAssignment','Lines','LineStrategy','PTILineAssignment','YodleeLines') { $Result = Invoke-Sqlcmd -ServerInstance $ServerName -Database $DatabaseName -Query ("SET IDENTITY_INSERT $($Sheet.Name)_NEW ON;" + $InsertRecords -join "`n" + "SET IDENTITY_INSERT $($Sheet.Name)_NEW OFF") } Else { $Result = Invoke-Sqlcmd -ServerInstance $ServerName -Database $DatabaseName -Query ($InsertRecords -join "`n") } $result_list += @{ Name = "$($Sheet.Name)_NEW"; Results = $( Invoke-SqlCmd -ServerInstance $ServerName -Database $DatabaseName -Query "SELECT COUNT(*) AS 'row_count' FROM $($Sheet.Name)_NEW" | Select-Object -ExpandProperty row_count)} } $result_list.ForEach({[PSCustomObject]$_}) | Format-Table -AutoSize |
Conclusion
As I am still building my development skills, this is kind of my working process. Get a working model, then come back and improve code. I still have a couple of additions I would like to add, but I have a working procedure that will save time for me and anyone on my team that gets this request in the future.