How do I append one string to another in Python?

1 Answer

answered Dec 4, 2020 by hari_sh (26.4k points)
edited Dec 9, 2020 by hari_sh

Let's assume you have one reference to a string and you're trying to concatenate another string to the end of that string, Now Cpython will attempts to extend the string in place.

At the end, the result we get is that the operation is amortized O(n).

Example:

s = ""
for i in range(n):
s+=str(i)
It is used to be O(n^2).
From the source (bytesobject.c):
void
PyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)
{
PyBytes_Concat(pv, w);
Py_XDECREF(w);
}
/* The following function breaks the notion that strings are immutable:
it changes the size of a string. We get away with this only if there
is only one module referencing the object. You can also think of it
as creating a new string object and destroying the old one, only
more efficiently. In any case, don't use this if the string may
already be known to some other part of the code...
Note that if there's not enough memory to resize the string, the original
string object at *pv is deallocated, *pv is set to NULL, an "out of
memory" exception is set, and -1 is returned. Else (on success) 0 is
returned, and the value in *pv may or may not be the same as on input.
As always, an extra byte is allocated for a trailing \0 byte (newsize
does *not* include that), and a trailing \0 byte is stored.
*/
int
_PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)
{
register PyObject *v;
register PyBytesObject *sv;
v = *pv;
if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {
*pv = 0;
Py_DECREF(v);
PyErr_BadInternalCall();
return -1;
}
/* XXX UNREF/NEWREF interface should be more symmetrical */
_Py_DEC_REFTOTAL;
_Py_ForgetReference(v);
*pv = (PyObject *)
PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);
if (*pv == NULL) {
PyObject_Del(v);
PyErr_NoMemory();
return -1;
}
_Py_NewReference(*pv);
sv = (PyBytesObject *) *pv;
Py_SIZE(sv) = newsize;
sv->ob_sval[newsize] = '\0';
sv->ob_shash = -1; /* invalidate cached hash value */
return 0;
}

Now it's very easy for verifying empirically.

$ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
1000000 loops, best of 3: 1.85 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
10000 loops, best of 3: 16.8 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
10000 loops, best of 3: 158 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
1000 loops, best of 3: 1.71 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 14.6 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
10 loops, best of 3: 173 msec per loop

It's important however to note that this optimisation isn't part of the Python spec. It's just in the cPython execution apparently. The same empirical testing on pypy or jython for example might show the older O(n**2) performance .

$ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
10000 loops, best of 3: 90.8 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
1000 loops, best of 3: 896 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
100 loops, best of 3: 9.03 msec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
10 loops, best of 3: 89.5 msec per loop

$ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 12.8 sec per loop

From this, pypy is accomplishing something that functions admirably with short strings, yet performs ineffectively for bigger strings.

Want to learn more about python? Come and join: python course

Related questions

0 votes

2 answers

How do I append one string to another in Python?

asked Jul 3, 2019 in Python by Sammy (47.6k points)

0 votes

1 answer

In Java, how to append a string more efficiently?

asked Feb 25, 2021 in Java by dante07 (13.1k points)

+3 votes

2 answers

How do you append to a file in Python?

asked May 26, 2019 in Python by Ritik (3.5k points)

0 votes

1 answer

Append values to a set in Python

asked Jul 22, 2019 in Python by Eresh Kumar (45.3k points)

0 votes

1 answer

Using Python to append large set of data frames to an existing excel file

asked Jul 25, 2019 in BI by Vaibhav Ameta (17.6k points)

Browse Categories

...