OLEDB - Get column names with Jet OLE DB in VB.NET
I've written a function that reads CSV files and parametrizes them accordingly. For that I have a function GetTypesSQL which first queries the SQL table for its data types, so the columns can be adjusted before they are later inserted into SQL. The problem is that when I set HDR=YES in Jet OLE DB, the column names come back as F1, F2, F3. To circumvent the issue I've set HDR=NO and written the loops myself, but I end up with empty strings. What is the problem? Here is the code:
Private Function GetCsvFile(ByVal file As String, ByVal min As Integer, ByVal max As Integer) As DataTable
    Dim constr As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & TextBox1.Text & ";Extended Properties=""text;HDR=NO;IMEX=1;FMT=Delimited;CharacterSet=65001"""
    Dim conn As New OleDb.OleDbConnection(constr)
    Dim dt As New DataTable
    Dim da As OleDb.OleDbDataAdapter = Nothing
    Dim getData As DataTable = Nothing
    Try
        Dim cmd As String = "SELECT * FROM " & _table & ".csv"
        da = New OleDb.OleDbDataAdapter(cmd, conn)
        da.Fill(min, max, dt)
        getData = New DataTable(_table)
        ' first data row holds the column names because HDR=NO
        Dim firstRow As DataRow = dt.Rows(0)
        For i As Integer = 0 To dt.Columns.Count - 1
            Dim columnName As String = firstRow(i).ToString()
            Dim newColumn As New DataColumn(columnName, mListOfTypes(i))
            getData.Columns.Add(newColumn)
        Next
        ' copy the remaining rows, scrubbing string values on the way
        For i As Integer = 1 To dt.Rows.Count - 1
            Dim row As DataRow = dt.Rows(i)
            Dim newRow As DataRow = getData.NewRow()
            For j As Integer = 0 To getData.Columns.Count - 1
                If row(j).GetType() Is GetType(String) Then
                    Dim colValue As String = row(j).ToString()
                    colValue = ChangeEncoding(colValue)
                    colValue = ParseString(colValue)
                    colValue = ReplaceChars(colValue)
                    newRow(j) = colValue
                Else
                    newRow(j) = row(j)
                End If
            Next
            getData.Rows.Add(newRow)
            Application.DoEvents()
        Next
    Catch ex As OleDb.OleDbException
        MessageBox.Show(ex.Message)
    Catch ex As Exception
        MessageBox.Show(ex.Message)
    Finally
        dt.Dispose()
        da.Dispose()
    End Try
    Return getData
End Function
And here is the function that gets the types from SQL; one of them doesn't convert properly (doubles):
Private Sub GetTypesSQL()
    If (mListOfTypes Is Nothing) Then
        mListOfTypes = New List(Of Type)()
    End If
    mListOfTypes.Clear()
    Dim dtTabelShema As DataTable = db.GetDataTable("SELECT TOP 0 * FROM " & _table)
    Using dtTabelShema
        For Each col As DataColumn In dtTabelShema.Columns
            mListOfTypes.Add(col.DataType)
        Next
    End Using
End Sub
I think you have made this more complicated than it needs to be. For instance, to get the DB schema you are creating an empty DataTable and harvesting the datatypes from it. Why not use that first table rather than creating a new table from the types? The table need not be reconstructed over and over for each batch of rows imported.
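For example, a minimal sketch of that idea, reusing db.GetDataTable and _table from the question; the schema-only result can itself serve as the destination table, so no List(Of Type) is needed:

' the empty result of SELECT TOP 0 already carries the destination
' column names and types, so it can be used as the destination table
Dim getData As DataTable = db.GetDataTable("SELECT TOP 0 * FROM " & _table)
getData.TableName = _table

' rows parsed from the CSV can then be added to it directly
Dim newRow As DataRow = getData.NewRow()
newRow(0) = "parsed value"   ' illustrative only
getData.Rows.Add(newRow)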
More generally, since OLEDB will try to infer types from the data, it seems unnecessary and may get in the way in some cases. You are also redoing everything OLEDB does and then copying the data to a different DataTable. Given that, I'd skip the overhead OLEDB imposes and work with the raw data.
Your code creates the destination table using the CSV column names and the types from the database. If the CSV columns are not in the same order as those delivered by a SELECT * query, this will fail; for example, if the table is (ID, Name, Amount) but the CSV is (Name, Amount, ID), the Name values get paired with ID's type.
The following uses a class to map CSV columns to DB table columns, so the code does not depend on the CSVs being in the same order (since they may be generated externally). The sample data CSV used is deliberately not in the same order:
Public Class CSVMapItem
    Public Property CSVIndex As Int32
    Public Property ColName As String = ""
    'optional
    Public Property DataType As Type

    Public Sub New(ndx As Int32, csvName As String, dtCols As DataColumnCollection)
        CSVIndex = ndx
        For Each dc As DataColumn In dtCols
            If String.Compare(dc.ColumnName, csvName, True) = 0 Then
                ColName = dc.ColumnName
                DataType = dc.DataType
                Exit For
            End If
        Next
        If String.IsNullOrEmpty(ColName) Then
            Throw New ArgumentException("Cannot find column: " & csvName)
        End If
    End Sub
End Class
The code to parse the CSV uses CsvHelper, but in this case the TextFieldParser could be used instead, since the code only reads each CSV row into a string array (a sketch appears after the batching notes below).
Dim sql = String.Format("SELECT * FROM {0} WHERE ID < 0", dbTblName)
Dim rowCount As Int32 = 0
Dim totalRows As Int32 = 0
Dim sw As New Stopwatch
sw.Start()

' dtSample and daSample are form-level fields in this sketch
Using dbcon As New MySqlConnection(MySQLConnStr)
    Using cmd As New MySqlCommand(sql, dbcon)
        dtSample = New DataTable
        dbcon.Open()

        ' load the empty DT, create the insert command
        daSample = New MySqlDataAdapter(cmd)
        Dim cb = New MySqlCommandBuilder(daSample)
        daSample.InsertCommand = cb.GetInsertCommand
        dtSample.Load(cmd.ExecuteReader())
        ' dtSample has no rows yet, but it now has the columns we need

        Dim csvMap As New List(Of CSVMapItem)
        Using sr As New StreamReader(csvfile, False),
              parser = New CsvParser(sr)

            ' col names from the CSV
            Dim csvNames = parser.Read()

            ' create a map of CSV index to DT column name (see note)
            For n As Int32 = 0 To csvNames.Length - 1
                csvMap.Add(New CSVMapItem(n, csvNames(n), dtSample.Columns))
            Next

            ' line data read as string
            Dim data As String()
            data = parser.Read()
            Dim dr As DataRow
            Do Until data Is Nothing OrElse data.Length = 0
                dr = dtSample.NewRow()
                For Each item In csvMap
                    ' optional/as-needed type conversion
                    If item.DataType = GetType(Boolean) Then
                        ' "1" won't convert to Boolean, but (int)1 will
                        dr(item.ColName) = Convert.ToInt32(data(item.CSVIndex).Trim)
                    Else
                        dr(item.ColName) = data(item.CSVIndex).Trim
                    End If
                Next
                dtSample.Rows.Add(dr)
                rowCount += 1

                data = parser.Read()
                If rowCount = 50000 OrElse (data Is Nothing OrElse data.Length = 0) Then
                    totalRows += daSample.Update(dtSample)
                    ' empty the table if there are more than 100k rows
                    dtSample.Rows.Clear()
                    rowCount = 0
                End If
            Loop
        End Using
    End Using
End Using
sw.Stop()
Console.WriteLine("Parsed and imported {0} rows in {1}", totalRows, sw.Elapsed.TotalMinutes)
The processing loop updates the DB every 50k rows, in case there are many, many rows. It also does so in one pass, rather than reading N rows through OLEDB at a time. CsvParser will read one row at a time, so there should never be more than 50,001 rows' worth of data on hand at once.
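As mentioned above, the TextFieldParser could stand in for CsvParser here, since both deliver a row as a string array. A minimal sketch, assuming the same csvfile path variable:

Imports Microsoft.VisualBasic.FileIO

Using parser As New TextFieldParser(csvfile)
    parser.TextFieldType = FieldType.Delimited
    parser.SetDelimiters(",")
    parser.HasFieldsEnclosedInQuotes = True

    ' header row, same role as the first parser.Read() above
    Dim csvNames As String() = parser.ReadFields()

    While Not parser.EndOfData
        ' one data row as a string array, like CsvParser.Read()
        Dim data As String() = parser.ReadFields()
        ' ... build DataRows from data via csvMap as in the main loop
    End While
End Using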
There may be special cases to handle in the type conversions, as shown with If item.DataType = GetType(Boolean) Then. A Boolean column read in as "1" can't be passed directly to a Boolean column, but converted to an integer it can. There could be other conversions needed, such as funky dates.
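For instance, a hedged extension of that conversion block for a date column; the "d/M/yyyy" format string is an assumption about the CSV data, not something from the original:

Imports System.Globalization

If item.DataType = GetType(Boolean) Then
    ' "1" won't convert to Boolean, but (int)1 will
    dr(item.ColName) = Convert.ToInt32(data(item.CSVIndex).Trim)
ElseIf item.DataType = GetType(DateTime) Then
    ' funky dates: parse the exact format rather than relying on
    ' an implicit string-to-date conversion
    dr(item.ColName) = DateTime.ParseExact(data(item.CSVIndex).Trim,
                                           "d/M/yyyy", CultureInfo.InvariantCulture)
Else
    dr(item.ColName) = data(item.CSVIndex).Trim
End If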
Time to process 250,001 rows: 3.7 minutes. An app which needs to apply string transforms to every single string column will take longer. I'm pretty sure that by using the CsvReader in CsvHelper you could have those applied as part of parsing to type (a sketch follows).
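A rough sketch of that idea, assuming a recent CsvHelper release (where ConvertUsing was renamed Convert) and a hypothetical Customer type; this is an illustration, not the answer's code:

Imports System.Globalization
Imports System.IO
Imports CsvHelper
Imports CsvHelper.Configuration

' hypothetical record type for illustration
Public Class Customer
    Public Property Name As String
    Public Property Active As Boolean
End Class

' the transforms run while parsing, instead of in a second pass
Public Class CustomerMap
    Inherits ClassMap(Of Customer)

    Public Sub New()
        ' string scrubbing applied during the read
        Map(Function(c) c.Name).Convert(Function(args) args.Row.GetField("Name").Trim())
        ' "1"/"0" handled here instead of a special case in the loop
        Map(Function(c) c.Active).Convert(Function(args) args.Row.GetField("Active") = "1")
    End Sub
End Class

' usage:
' Using sr As New StreamReader(csvfile), csv As New CsvReader(sr, CultureInfo.InvariantCulture)
'     csv.Context.RegisterClassMap(Of CustomerMap)()
'     For Each rec In csv.GetRecords(Of Customer)()
'         ' ...
'     Next
' End Using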
There is a potential disaster waiting to happen, since this is meant to be an all-purpose importer/scrubber:
For i As Integer = 0 To dt.Columns.Count - 1
    Dim columnName As String = firstRow(i).ToString()
    Dim newColumn As New DataColumn(columnName, mListOfTypes(i))
    getData.Columns.Add(newColumn)
Next
Both the question and the self-answer build the new table using the column names from the CSV and the datatypes from a SELECT * query on the destination table. So it assumes the CSV columns are in the same order that SELECT * will return them, and that the CSVs always use the same names as the tables.
The answer above is marginally better in that it finds and matches columns based on name.
A more robust solution would be to write a little utility app where a user maps a DB column name to a CSV index. Save the results to a List(Of CSVMapItem) and serialize it; there could be a whole collection of these saved to disk. Then, rather than creating the map based on dead reckoning, just deserialize the desired one for use as csvMap in the above code.
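A minimal sketch of persisting such a map; CsvMapEntry, SaveMap and LoadMap are hypothetical names. Only the name/index pairs are stored, because System.Type does not serialize well; the DataType is rebuilt from the live table schema by the CSVMapItem constructor on load:

Imports System.Data
Imports System.IO
Imports System.Linq
Imports System.Xml.Serialization

' serializable pair mapping a DB column name to a CSV index
Public Class CsvMapEntry
    Public Property DbColName As String = ""
    Public Property CsvIndex As Int32
End Class

Module CsvMapStore
    ' save the user-defined mapping for one CSV layout
    Public Sub SaveMap(path As String, entries As List(Of CsvMapEntry))
        Dim ser As New XmlSerializer(GetType(List(Of CsvMapEntry)))
        Using fs As New FileStream(path, FileMode.Create)
            ser.Serialize(fs, entries)
        End Using
    End Sub

    ' load the mapping and rebuild the CSVMapItems against the table schema
    Public Function LoadMap(path As String, dtCols As DataColumnCollection) As List(Of CSVMapItem)
        Dim ser As New XmlSerializer(GetType(List(Of CsvMapEntry)))
        Using fs As New FileStream(path, FileMode.Open)
            Dim entries = CType(ser.Deserialize(fs), List(Of CsvMapEntry))
            Return entries.Select(Function(e) New CSVMapItem(e.CsvIndex, e.DbColName, dtCols)).ToList()
        End Using
    End Function
End Module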