Parse an integer from a string with trailing garbage

I need to parse a decimal integer that appears at the start of a string.

There may be trailing garbage after the decimal number. It must be ignored, even if it contains other numbers.

e.g.

"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2

Is there a built-in method in the .NET framework to do this?

int.TryParse() is not suitable. It allows trailing spaces but not other trailing characters.

It would be quite easy to implement this but I would prefer to use the standard method if it exists.

13.10.2009 16:13:08
I'm assuming you hate regular expressions, but I think that's the kind of problem they're meant to solve...
axel_c 13.10.2009 16:17:13
Using a regular expression is fine. But if there's a built-in function that would be preferable.
finnw 13.10.2009 16:32:50
Is a valid "integer" character always followed or only ever preceded by a space character?
ChrisBD 13.10.2009 16:33:10
@ChrisBD, there are not necessarily any spaces at all. But the first non-space character is always a digit.
finnw 13.10.2009 16:56:12
Michael Freidgeim 8.08.2016 06:42:43
10 ANSWERS
ACCEPTED ANSWER
foreach (var m in Regex.Matches(" 3 - .x. 4", @"\d+"))
{
    Console.WriteLine(m);
}

Updated per comments

Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.

To get first int:

Match match = Regex.Match(" 3 - .x. - 4", @"\d+");
if (match.Success)
    Console.WriteLine(int.Parse(match.Value));
17
13.10.2009 16:47:22
I only need the first number, so you could stick a 'break' in there.
finnw 13.10.2009 16:40:21
@finnw: Was confused by the comment you made on another answer. To get the first value use the Regex.Match function, it can be seen in one of my Rollbacks.
Yuriy Faktorovich 13.10.2009 16:42:49
@Yuriy, I was referring to multi-digit numbers (e.g. "42"), not multiple numbers in the string.
finnw 13.10.2009 16:48:59
string s = " 3 -.X.-".Trim();
string collectedNumber = string.Empty;
int i;

for (int x = 0; x < s.Length; x++)
{
  // int.TryParse takes a string, so convert the single character first
  if (int.TryParse(s[x].ToString(), out i))
     collectedNumber += s[x];
  else
     break;     // not a number - that's it - get out.
}

if (int.TryParse(collectedNumber, out i))
    Console.WriteLine(i);
else
    Console.WriteLine("no number found");
1
13.10.2009 16:25:57
That will only parse one digit. The number may have multiple digits.
finnw 13.10.2009 16:17:27
@finnw- then just throw another if statement inside the first one to iterate to the following position to check
TStamper 13.10.2009 16:19:51
@finnw Ok, here is another iteration that handles multiple numbers
AngryHacker 13.10.2009 16:26:16

There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).

Will the result always be non-negative (which would make things easier)?

To be honest, regular expressions are the easiest option here, but...

public static string RemoveCruftFromNumber(string text)
{
    int end = 0;

    // First move past leading spaces
    while (end < text.Length && text[end] == ' ')
    {
        end++;
    }

    // Now move past digits
    while (end < text.Length && char.IsDigit(text[end]))
    {
        end++;
    }

    return text.Substring(0, end);
}

Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).
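A minimal sketch of that last step, assuming the RemoveCruftFromNumber method shown above. The wrapper name TryParseLeadingInt and the class name are hypothetical. int.TryParse returns false both when no digits were found and when the value does not fit in an int, so the caller cannot tell those two cases apart without extra checks.

```csharp
using System;

public static class LeadingIntDemo
{
    // Same logic as the method in the answer above.
    public static string RemoveCruftFromNumber(string text)
    {
        int end = 0;
        while (end < text.Length && text[end] == ' ')
        {
            end++;
        }
        while (end < text.Length && char.IsDigit(text[end]))
        {
            end++;
        }
        return text.Substring(0, end);
    }

    // Hypothetical wrapper: false means "no leading digits" or "too big for int".
    public static bool TryParseLeadingInt(string text, out int value)
    {
        return int.TryParse(RemoveCruftFromNumber(text), out value);
    }

    public static void Main()
    {
        string[] samples = { "1", " 42 ", " 3 -.X.-", " 2 3 4 5", "99999999999 x", "abc" };
        foreach (string s in samples)
        {
            int value;
            if (TryParseLeadingInt(s, out value))
                Console.WriteLine("\"" + s + "\" => " + value);
            else
                Console.WriteLine("\"" + s + "\" => no int");
        }
    }
}
```

If values larger than int.MaxValue are expected, long.TryParse could be substituted in the wrapper.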

6
13.10.2009 16:45:27
The garbage is at the end of the string, not the start (I do not consider the leading space to be garbage, since the built-in functions like int.Parse can handle that.)
finnw 13.10.2009 16:25:54
Okay, edited. (Was this the reason for the downvote? If not, I'd be interested to hear what it was for...)
Jon Skeet 13.10.2009 16:34:26
"Was this the reason for the downvote? If not, I'd be interested to hear what it was for..." it's like Federer bitching about the ref telling him to be quiet.
Yuriy Faktorovich 13.10.2009 16:51:23
@Yuriy: I'm afraid I don't understand your comment. I always like to hear why I'm being downvoted, so that I can improve my answer. @finnw: Yes, this answer could very easily be simplified to a regex - I didn't do so based on your expression of dislike for regexes in the question :) Let me know if you want me to put that in the answer.
Jon Skeet 13.10.2009 17:07:51
Unless you edited your question, he may well have been unable to remove the downvote. The system is unfortunate that way sometimes.
Jon Skeet 13.10.2009 18:24:50

I'm not sure why you would avoid Regex in this situation.

Here's a little hackery that you can adjust to your needs.

" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);

public static class CharArrayExtensions
{
    public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
    {
        foreach (var c in array)
        {
            if(char.IsNumber(c))
                yield return c;
        }
    }
}

EDIT: That's true about the incorrect result (and the maintenance dev :) ).

Here's a revision:

    public static int FindFirstInteger(this IEnumerable<char> array)
    {
        bool foundInteger = false;
        var ints = new List<char>();

        foreach (var c in array)
        {
            if(char.IsNumber(c))
            {
                foundInteger = true;
                ints.Add(c);
            }
            else
            {
                if(foundInteger)
                {
                    break;
                }
            }
        }

        string s = string.Empty;
        ints.ForEach(i => s += i.ToString());
        return int.Parse(s);
    }
0
13.10.2009 16:59:41
That's pretty clever. Of course the maintenance dev will hate you.
AngryHacker 13.10.2009 16:31:00
That would give an incorrect result for numbers longer than 1 digit.
finnw 13.10.2009 16:35:17
    private string GetInt(string s)
    {
        int i = 0;

        s = s.Trim();
        while (i<s.Length && char.IsDigit(s[i])) i++;

        return s.Substring(0, i);
    }
0
13.10.2009 18:05:52
I am not the downvoter, but I would guess it's because you do a linear search of the 'nums' list instead of the simpler 'char.IsNumber(s[i])'.
finnw 13.10.2009 17:09:02
I guessed that also, but I wasn't aware it existed... anyway glad I learned something and took -1 in the figure ;)
manji 13.10.2009 18:08:18

Might as well add mine too.

        string temp = " 3 .x£";
        string numbersOnly = String.Empty;
        int tempInt;
        for (int i = 0; i < temp.Length; i++)
        {
            if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
            {
                numbersOnly += temp[i];
            }
        }

        Int32.TryParse(numbersOnly, out tempInt);
        MessageBox.Show(tempInt.ToString());

The message box is just for testing purposes, just delete it once you verify the method is working.

0
13.10.2009 16:30:12

You can use Linq to do this, no Regular Expressions needed:

public static int GetLeadingInt(string input)
{
   return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray()));
}

This works for all your provided examples:

string[] tests = new string[] {
   "1",
   " 42 ",
   " 3 -.X.-",
   " 2 3 4 5"
};

foreach (string test in tests)
{
   Console.WriteLine("Result: " + GetLeadingInt(test));
}
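For comparison, here is a rough sketch of the same leading-integer extraction done with a plain loop and no intermediate strings or arrays. The names LeadingIntScanner and ParseLeadingInt are hypothetical, and the behaviour differs from the LINQ version in two assumed ways: it accepts only ASCII digits (no '.'), and it returns null instead of throwing when there is no leading digit or the value overflows an int.

```csharp
using System;

public static class LeadingIntScanner
{
    // Hypothetical helper: null means no leading digits, or a value too big for int.
    public static int? ParseLeadingInt(string input)
    {
        int i = 0;

        // Skip leading whitespace.
        while (i < input.Length && char.IsWhiteSpace(input[i]))
        {
            i++;
        }

        int start = i;
        long value = 0;

        // Accumulate ASCII digits, stopping at the first other character.
        while (i < input.Length && input[i] >= '0' && input[i] <= '9')
        {
            value = value * 10 + (input[i] - '0');
            if (value > int.MaxValue)
            {
                return null; // overflow
            }
            i++;
        }

        if (i == start)
        {
            return null; // no digits found
        }
        return (int)value;
    }

    public static void Main()
    {
        string[] tests = { "1", " 42 ", " 3 -.X.-", " 2 3 4 5" };
        foreach (string test in tests)
        {
            Console.WriteLine("Result: " + ParseLeadingInt(test));
        }
    }
}
```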
24
13.10.2009 16:46:07
Why are you calling ToCharArray? String already implements IEnumerable<char>.
Jon Skeet 13.10.2009 16:33:38
I love it! Thanks for the nice solution.
Chris Martin 13.10.2009 17:00:35
Nice solution. One question... is the || c == '.' actually needed? The examples don't show anything but integer results. If removed it would speed it up a bit which may be significant if there are many extractions.
MQuiggGeorgia 29.06.2017 15:03:08
This is pretty inefficient, creating at least four intermediate objects for an operation that can be done with zero.
Tim Sylvester 21.12.2018 18:55:36

This is how I would have done it in Java:

int parseLeadingInt(String input)
{
    NumberFormat fmt = NumberFormat.getIntegerInstance();
    fmt.setGroupingUsed(false);
    return fmt.parse(input, new ParsePosition(0)).intValue();
}

I was hoping something similar would be possible in .NET.

This is the regex-based solution I am currently using:

int? parseLeadingInt(string input)
{
    int result = 0;
    Match match = Regex.Match(input, "^[ \t]*\\d+");
    if (match.Success && int.TryParse(match.Value, out result))
    {
        return result;
    }
    return null;
}
1
14.10.2009 09:21:53

I like @Donut's approach.

I'd like to add, though, that char.IsDigit and char.IsNumber also accept some Unicode characters that are digits in other languages and scripts.
If you only want to check for the digits 0 to 9, you could use "0123456789".Contains(c).

Three example implementations:

To remove trailing non-digit characters:

var digits = new string(input.Trim().TakeWhile(c =>
    ("0123456789").Contains(c)
).ToArray());

To remove leading non-digit characters:

var digits = new string(input.Trim().SkipWhile(c =>
    !("0123456789").Contains(c)
).ToArray());

To remove all non-digit characters:

var digits = new string(input.Trim().Where(c =>
    ("0123456789").Contains(c)
).ToArray());

And of course: int.Parse(digits) or int.TryParse(digits, out output)
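To illustrate the Unicode point, here is a small sketch. The names DigitCheckDemo, IsAsciiDigit and LeadingAsciiDigits are hypothetical. It uses U+0663 (ARABIC-INDIC DIGIT THREE) as an example of a character that char.IsDigit accepts but an ASCII-only check rejects.

```csharp
using System;
using System.Linq;

public static class DigitCheckDemo
{
    // Hypothetical helper: true only for the ASCII digits '0' through '9'.
    public static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Hypothetical helper: the leading run of ASCII digits after trimming whitespace.
    public static string LeadingAsciiDigits(string input)
    {
        return new string(input.Trim().TakeWhile(c => IsAsciiDigit(c)).ToArray());
    }

    public static void Main()
    {
        char arabicIndicThree = '\u0663';

        Console.WriteLine(char.IsDigit(arabicIndicThree)); // True: Unicode decimal digit
        Console.WriteLine(IsAsciiDigit(arabicIndicThree)); // False: not in '0'..'9'

        int n;
        string digits = LeadingAsciiDigits(" 42 apples");
        Console.WriteLine(int.TryParse(digits, out n) ? n.ToString() : "no number found");
    }
}
```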

5
23.05.2017 12:13:54
IMHO slightly more efficient to replace ("0123456789").Contains(c) with c >= '0' && c <= '9'
ToolmakerSteve 26.12.2019 22:34:39

This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:

for (int p = input.Length;  p > 0;  p--)
{
    int  num;
    if (int.TryParse(input.Substring(0, p), out num))
        return num;
}
throw new Exception("Malformed integer: " + input);

Of course, this will be slow if input is very long.

ADDENDUM (March 2016)

This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:

for (int p = input.Length;  p > 0;  p--)
{
    char  ch;
    do
    {
        ch = input[--p];
    } while ((ch < '0'  ||  ch > '9')  &&  ch != ' '  &&  p > 0);
    p++;

    int  num;
    if (int.TryParse(input.Substring(0, p), out num))
        return num;
}
throw new Exception("Malformed integer: " + input);
1
4.03.2016 18:38:25