I have a small program that imports a CSV file into a SQL Server database, but when you press the button twice (or something similar) it adds duplicates. I need it to skip a row when it is a duplicate. If someone can help me with the code, that would be awesome.

EDIT: I noticed some answers saying I need to disable the button while the import is running. While that solves one problem, I also want rows to be skipped when the data in the CSV file is the same as what is already in the database.

code:

private void button1_Click(object sender, EventArgs e)
{
    SqlConnection con = new SqlConnection(@"server=localhost;Initial Catalog=klantbestand;Integrated Security=SSPI;");
    string filepath = @"C:\clients TEST.csv";

    StreamReader sr = new StreamReader(filepath);

    string line = sr.ReadLine();
    string[] value = line.Split(';');
    DataTable dt = new DataTable();
    DataRow row;

    foreach (string dc in value)
    {
        dt.Columns.Add(new DataColumn(dc));
    }

    while (!sr.EndOfStream)
    {
        value = sr.ReadLine().Split(';');
        if (value.Length == dt.Columns.Count)
        {
            row = dt.NewRow();
            row.ItemArray = value;
            dt.Rows.Add(row);
        }
    }

    SqlBulkCopy bc = new SqlBulkCopy(con.ConnectionString, SqlBulkCopyOptions.TableLock);
    bc.DestinationTableName = "GegevensCSV";
    bc.BatchSize = dt.Rows.Count;
    con.Open();
    bc.WriteToServer(dt);
    bc.Close();
    con.Close();
}
2 Comments
  • Wouldn't it be safer to use some kind of mutex, or to disable the button while it's working? Commented Sep 27, 2016 at 10:08
  • Well, it is not just when you press the button twice; it also happens when there is already a value in the database that is the same as in the CSV file. Commented Sep 27, 2016 at 11:02

2 Answers


The general principle I follow when bulk loading data into a database table is to create a new staging table (that will only exist temporarily for the duration of the import), bulk load all the data into that, and then run a query to migrate that data into the final destination table. After that, just drop the staging table.

This gives a few benefits:

  1. You can increase initial bulk loading performance into the db/reduce initial contention, without impacting other processes using the target table (e.g. SqlBulkCopyOptions.TableLock)
  2. You can then implement things in that 2nd migration step, such as not copying over data that already exists in the final destination table (as you are wanting to do)
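A minimal T-SQL sketch of this staging approach. The destination table GegevensCSV comes from the question's code; the staging table name and the columns Naam and Email are illustrative assumptions, so substitute your own schema and candidate-key columns:

```sql
-- 1. Create a temporary staging table with the same shape as the destination
--    (column names here are assumptions; match them to your CSV).
CREATE TABLE dbo.GegevensCSV_Staging (
    Naam  nvarchar(100),
    Email nvarchar(100)
);

-- 2. Bulk load the CSV into dbo.GegevensCSV_Staging with SqlBulkCopy
--    (same C# code as in the question, but pointed at the staging table).

-- 3. Copy only the rows that do not already exist in the destination.
INSERT INTO dbo.GegevensCSV (Naam, Email)
SELECT s.Naam, s.Email
FROM dbo.GegevensCSV_Staging AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.GegevensCSV AS t
    WHERE t.Naam = s.Naam
      AND t.Email = s.Email
);

-- 4. Clean up the staging table.
DROP TABLE dbo.GegevensCSV_Staging;
```

The WHERE NOT EXISTS check is what implements the "skip duplicates" requirement; the columns you compare there decide what counts as a duplicate.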

1 Comment

Could you help me on my way with some code? I am not really that familiar with programming yet.

The answer from @AdaTheDev is right, but there is another way of doing this.

If duplicate inserts are the problem during bulk import, you can also handle it with a Stored Procedure instead of the bulk copy function.

In SQL Server 2008 and later you can use "Table-Valued Parameters", i.e. you can pass a whole table as a parameter into the Stored Procedure and manipulate it on the server side.

If you pass your table to the server through a Stored Procedure parameter, you can make use of the MERGE command. MERGE is an upsert command: you can insert, update, or delete the required record(s) in a single statement, in a safe and quick way.

Here are some details about the process:

Step 1: Create a Table-Valued Parameter type in SQL Server. The command is:

CREATE TYPE [dbo].[TableTypeName] AS TABLE(
    [ColumnName1] [DataType],
    [ColumnName2] [DataType],
    [ColumnName3] [DataType]
)
GO

Here "ColumnName1", "ColumnName2", and "ColumnName3" are the names of the table columns, and "DataType" is the SQL Server data type assigned to each column.
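For example, a filled-in version for the question's client data might look like this (the type name, column names, and sizes are illustrative assumptions, not taken from the question):

```sql
-- Hypothetical concrete table type for the CSV rows; adjust names and
-- sizes to match your actual CSV columns.
CREATE TYPE [dbo].[ClientTableType] AS TABLE(
    [Naam]  [nvarchar](100),
    [Adres] [nvarchar](200),
    [Email] [nvarchar](100)
)
GO
```

Note that the varchar/nvarchar sizes must be at least as large as the longest value in the CSV, or the insert will fail with a truncation error.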

Step 2: Create the Stored Procedure with the MERGE command:

CREATE PROCEDURE [dbo].[ProcedureName]

    @TableTypeName [dbo].[TableTypeName] READONLY

AS 
BEGIN

    BEGIN TRY

        BEGIN TRANSACTION 

            -- Merge command 
            MERGE INTO [dbo].[TableName] AS [Target]
            USING (     
                    SELECT * FROM @TableTypeName                            
                ) AS [Source]
            -- Candidate keys: the combination of column(s) that makes a record unique.
            ON [Target].[ColumnName1] = [Source].[ColumnName1]
            AND [Target].[ColumnName2] = [Source].[ColumnName2] 
            AND [Target].[ColumnName3] = [Source].[ColumnName3] 

            -- Only rows with no match on the candidate keys are inserted,
            -- so existing records are skipped.
            WHEN NOT MATCHED THEN   
                INSERT 
                    (
                        [ColumnName1]
                        ,[ColumnName2]
                        ,[ColumnName3]
                    )
                VALUES
                    (
                        [Source].[ColumnName1]
                        ,[Source].[ColumnName2]
                        ,[Source].[ColumnName3]
                    );          

        COMMIT TRANSACTION

    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION;
        THROW;
    END CATCH

END 

Step 3: Finally, call the Stored Procedure from the C# code:

private void button1_Click(object sender, EventArgs e)
{
    string connectionString = @"server=localhost;Initial Catalog=klantbestand;Integrated Security=SSPI;";
    string filepath = @"C:\clients TEST.csv";

    StreamReader sr = new StreamReader(filepath);

    string line = sr.ReadLine();
    string[] value = line.Split(';');
    DataTable dt = new DataTable();
    DataRow row;

    foreach (string dc in value)
    {
        dt.Columns.Add(new DataColumn(dc));
    }

    while (!sr.EndOfStream)
    {
        value = sr.ReadLine().Split(';');
        if (value.Length == dt.Columns.Count)
        {
            row = dt.NewRow();
            row.ItemArray = value;
            dt.Rows.Add(row);
        }
    }

    if (dt.Rows.Count > 0)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (SqlCommand command = connection.CreateCommand())
            {
                command.CommandText = "dbo.ProcedureName";
                command.CommandType = CommandType.StoredProcedure;

                SqlParameter parameter;
                parameter = command.Parameters.AddWithValue("@TableTypeName", dt);
                parameter.SqlDbType = SqlDbType.Structured;
                parameter.TypeName = "dbo.TableTypeName";

                command.ExecuteNonQuery();
            }
        }
    }
}

In this way, you can bulk import your data without any duplicate record(s). You can even log the inserted rows or exception details if required.
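As a sketch, the logging mentioned above can be done with MERGE's OUTPUT clause, which goes just before the statement's closing semicolon. The names below follow the placeholder schema from the answer (TableName, ColumnName1, @TableTypeName); the @MergeLog table variable is an illustrative addition:

```sql
-- Capture what MERGE did per row into a table variable for logging.
DECLARE @MergeLog TABLE (
    [MergeAction] nvarchar(10),
    [ColumnName1] nvarchar(100)
);

MERGE INTO [dbo].[TableName] AS [Target]
USING (SELECT * FROM @TableTypeName) AS [Source]
ON [Target].[ColumnName1] = [Source].[ColumnName1]
WHEN NOT MATCHED THEN
    INSERT ([ColumnName1]) VALUES ([Source].[ColumnName1])
-- $action is 'INSERT', 'UPDATE', or 'DELETE' for each affected row.
OUTPUT $action, [Inserted].[ColumnName1] INTO @MergeLog;

-- Inspect or persist the log as needed.
SELECT * FROM @MergeLog;
```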

2 Comments

So I tried this and I got the exception: "An unhandled exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll. Additional information: String or binary data would be truncated. The data for table-valued parameter "@TableTypeName" doesn't conform to the table type of the parameter. SQL Server error is: 8152, state: 10. The statement has been terminated."
The exception is a string-truncation error, i.e. the size defined for the varchar field is not big enough. Secondly, ""@TableTypeName" doesn't conform to the table type of the parameter" means the parameter name defined in the Stored Procedure does not match the parameter name set on the SqlParameter.
