Sometimes last year, I was working on this Desktop application (VB.NET) that had to read a large file and run some complex analysis to give output in various formats.
Anyway, reading the large file took the better part of 5 weeks and I more or less tried all suggested solutions (from stackoverflow & co and MSDN website itself). Nothing.
I tried datareaders, datatables, tableadapters, xmlserializer, etc ( the whole microsoft file reading arsenal). I cracked my brain alongside my friend and system analyst Adelore Oreoluwa (he's awesome). They were all either too slow, or ineffective. (Some read part of the file and skipped some, some misplaced part of my input data, some simply hung the entire application and consumed large memory).
I finally fixed the problem by reading all my data directly from Excel 2003 (.xls) format, using a dll file got from some spanish programmer friends. The code snippets are below (in case you ever run into same problem). It read me over 20,000 rows of data in less than 3 seconds and misplaced none of my input.
Imports Excel //Of course you know where to put this
Private Sub CheckExcel()
Dim result As DataSet = Nothing
If Path.GetExtension(Me.txtfilename.Text).ToLower().Equals(".xls") Then
'Call the ReadExcelFile function
Call ReadExcelFile(result)
'Check that the excel file matches your input template
Call CheckTemplate(result)
Else
End If
End Sub
Private Sub ReadExcelFile(ByRef result As DataSet)
Dim stream As FileStream = File.Open(Me.OpenFileDialog1.FileName, FileMode.Open, FileAccess.Read)
Dim excelReader As IExcelDataReader
excelReader = ExcelReaderFactory.CreateBinaryReader(stream)
result = excelReader.AsDataSet()
Return
End Sub
Private Sub CheckTemplate(ByVal result As DataSet)
Try
objTable = result.Tables(0)
Dim fields(9) As String
For Each row As DataRow In objTable.Rows
For j As Int16 = 0 To 8
If row(j).ToString <> String.Empty Then
fields(j) = row(j).ToString
End If
Next
Call loadtobase(fields)
Next
'MessageBox.Show(objTable.Rows.Count.ToString + " Records Loaded", "Success")
Catch ex As Exception
' MessageBox.Show(ex.ToString)
Exit Sub
End Try
End Sub
||||||--------------------------------------------------------------------------------------|||||||||||||||||||
This code is in VB.Net, I'm sure if C#, F#, or Visual C++ is your thing, you can easily translate to fit your needs. (I can help with the C#, just contact me).
The dll has to be placed in the debug folder, something like: C:\Users\Username\Documents\Visual Studio 2012\Projects\Zeed REgiz\SeedlingRegistry1\SeedlingRegistry\bin\Debug\Excel.dll
(Adjust path to suit your System settings).
The dll file can be found on my github: github.com/C-spydo . It will work for any .NET language.