Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in SQL by (6.1k points)
closed by

SELECT REPLACE('<strong>100</strong><b>.00 GB', '%^(^-?\d*\.{0,1}\d+$)%', '');

I have to replace any markup between two parts of the number with the above regex, but it isn’t working. I don’t know if it is regex syntax is wrong because I tried a simpler one such as '%[^0-9]%' just to test but it didn't work either. Please help me out.

closed

4 Answers

0 votes
by (11.7k points)
selected by
 
Best answer

I suggest you to use PATINDEX to find the first index of the pattern (string's) occurrence. Then use STUFF to stuff another string into the pattern(string) matched.

Iterate through each row. You need to replace every illegal character with what you want. In your scenario, replace non numeric with blank. The inner loop is if you have more than one illegal character in a current cell of the loop.

DECLARE @counter int

SET @counter = 0

WHILE(@counter < (SELECT MAX(ID_COLUMN) FROM Table))

BEGIN 

    WHILE 1 = 1

    BEGIN

        DECLARE @RetVal varchar(50)

        SET @RetVal =  (SELECT Column = STUFF(Column, PATINDEX('%[^0-9.]%', Column),1, '')

        FROM Table

        WHERE ID_COLUMN = @counter)

        IF(@RetVal IS NOT NULL)       

          UPDATE Table SET

          Column = @RetVal

          WHERE ID_COLUMN = @counter

        ELSE

            break

    END

    SET @counter = @counter + 1

END

Caution: This one is slow. Having a varchar column may impact. So using LTRIM RTRIM may help a bit. But it is slow.It still iterates the max of the number of non-numeric characters in a single row, so it's not ideal, but I think it should be acceptable in most situations.

WHILE 1 = 1 BEGIN

    WITH q AS

        (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n

        FROM Table)

    UPDATE Table

    SET Column = STUFF(Column, q.n, 1, '')

    FROM q

    WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

    IF @@ROWCOUNT = 0 BREAK;

END;

Efficiency can also be improved quite a lot if you maintain a bit column in the table that indicates whether the field has been scrubbed yet.

DECLARE @done bit = 0;

WHILE @done = 0 BEGIN

    WITH q AS

        (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n

        FROM Table

        WHERE COALESCE(Scrubbed_Column, 0) = 0)

    UPDATE Table

    SET Column = STUFF(Column, q.n, 1, ''),

        Scrubbed_Column = 0

    FROM q

    WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

    IF @@ROWCOUNT = 0 SET @done = 1;

    -- if Scrubbed_Column is still NULL, then the PATINDEX

    -- must have given 0

    UPDATE table

    SET Scrubbed_Column = CASE

        WHEN Scrubbed_Column IS NULL THEN 1

        ELSE NULLIF(Scrubbed_Column, 0)

    END;

END;

If you want to get more insights of SQL, checkout this SQL Course from Intellipaat.

0 votes
by (13k points)

The REPLACE function in SQL Server does not support regular expression patterns. It is designed to replace a specific substring with another string. Therefore, you cannot directly use a regular expression pattern within the REPLACE function.

If you want to remove markup between two parts of a number using a regular expression, you can use other programming languages or tools that support regex functionality. You would need to retrieve the string from the database, perform the regex replacement using the appropriate programming language or tool, and then update the database with the modified string.

For example, if you are using a programming language like Python, you can retrieve the string from the database, use the re module to perform the regex replacement, and update the database with the modified string.

Here's an example using Python:

import re

import pyodbc

# Connect to the database

conn = pyodbc.connect('your_connection_string')

# Retrieve the string from the database

cursor = conn.cursor()

cursor.execute("SELECT YourColumn FROM YourTable")

row = cursor.fetchone()

string = row.YourColumn

# Perform the regex replacement

modified_string = re.sub('<.*?>', '', string)

# Update the database with the modified string

cursor.execute("UPDATE YourTable SET YourColumn = ? WHERE YourCondition", modified_string)

conn.commit()

# Close the connection

conn.close()

In this example, the re.sub function is used to replace all occurrences of <.*?> (which matches any markup between angle brackets) with an empty string.

Please note that the code provided is just an example and may require adjustments based on your specific database and programming language.

0 votes
by (11.4k points)
In SQL Server's `REPLACE` function, the pattern used for replacement is not based on regular expressions but rather on simple string matching. Therefore, you cannot use regular expression syntax directly in the `REPLACE` function.

If you want to remove any non-digit characters from a string, such as markup in your case, you can use a combination of string manipulation functions like `PATINDEX` and `STUFF` in SQL Server.

Here's an example query that achieves the desired result:

DECLARE @inputString NVARCHAR(100) = '<strong>100</strong><b>.00 GB';

WHILE PATINDEX('%[^0-9.]%', @inputString) > 0

BEGIN

  SET @inputString = STUFF(@inputString, PATINDEX('%[^0-9.]%', @inputString), 1, '')

END

SELECT @inputString AS Result;

In this example, the `PATINDEX` function is used to find the position of the first occurrence of a non-digit character (`[^0-9.]`) in the string. The `STUFF` function then replaces that character with an empty string. The loop continues until all non-digit characters are removed.

The result will be `'100.00'`, as the markup and non-digit characters have been removed from the original string.

Keep in mind that this solution assumes that the input string will always follow a specific format and contains only numbers and the dot character for decimal values. Adjustments may be needed depending on your specific requirements.
0 votes
by (7.8k points)
In SQL, the `REPLACE` function does not support regular expressions natively. It is designed to perform simple string replacements based on exact matches. Therefore, you cannot use regular expressions directly within the `REPLACE` function.

To achieve your goal of replacing any markup between two parts of a number, you will need to use a combination of string manipulation functions in SQL. Here's an example using the `PATINDEX` function:

DECLARE @input VARCHAR(100) = '<strong>100</strong><b>.00 GB';

DECLARE @start_index INT, @end_index INT;

-- Find the starting and ending index of the markup

SET @start_index = PATINDEX('%<%>%', @input) + 1;

SET @end_index = PATINDEX('%<%>%', SUBSTRING(@input, @start_index, LEN(@input) - @start_index + 1));

-- Replace the markup with an empty string

SET @input = STUFF(@input, @start_index, @end_index, '');

SELECT @input;

In this example, the `PATINDEX` function is used to find the starting and ending index of the markup (in this case, the `<strong>` and `</strong>` tags). The `STUFF` function is then used to replace the markup with an empty string.

Note that this approach assumes there is only one set of markup in the input string. If you have multiple sets of markup, you may need to modify the logic accordingly.

Remember to replace `@input` with your actual column name or string value.

Browse Categories

...