
Points to Note in Pandas

Pandas Caveats and Gotchas

Using If/Truth Statements with Pandas

When you try to convert something to a bool, which happens in an if statement or when using the Boolean operators and, or, and not, an error may be raised. It is not clear what the result of such a conversion should be, so pandas raises a ValueError exception.

 import pandas as pd
 # Raises ValueError: the truth value of a multi-element Series is ambiguous
 if pd.Series([False, True, False]):
     print('I am True')

The running results are as follows:

 ValueError: The truth value of a Series is ambiguous.
 Use a.empty, a.bool(), a.item(), a.any() or a.all().

In an if condition, it is unclear how the Series should be treated. The error message suggests stating explicitly what you mean, for example by calling .any() or .all() on the Series:

 import pandas as pd
 if pd.Series([False, True, False]).any():
    print("I am any")

The running results are as follows:

I am any
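The reductions named in the error message answer different questions, so the choice depends on what the condition is meant to test. The following is a minimal sketch; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([False, True, False])

# Each reduction answers a different question about the Series.
has_any_true = bool(s.any())   # at least one element is True
all_true = bool(s.all())       # every element is True
is_empty = s.empty             # the Series has no elements

print(has_any_true, all_true, is_empty)  # True False False
```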

To evaluate a single-element pandas object in a Boolean context, use the .bool() method:

 import pandas as pd
 print(pd.Series([True]).bool())

The running results are as follows:

True
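Recent pandas releases appear to have deprecated .bool(), so it is worth checking the behavior against your installed version. Assuming that is the case, .item() (also listed in the error message above) is one possible alternative for a single-element Series. A minimal sketch:

```python
import pandas as pd

single = pd.Series([True])

# .item() returns the only element as a Python scalar and raises
# ValueError if the Series does not hold exactly one element.
value = single.item()
print(value)  # True

try:
    pd.Series([True, False]).item()
    raised = False
except ValueError as exc:
    raised = True
    print("not a single element:", exc)
```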

Bitwise Boolean

Bitwise Boolean operators such as == and != return a Boolean Series, which is almost always what is required anyway.

 import pandas as pd
 s = pd.Series(range(5))
 print(s == 4)

The running results are as follows:

 0    False
 1    False
 2    False
 3    False
 4     True
 dtype: bool
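Because comparisons produce Boolean Series, several conditions can be combined element-wise with &, | and ~ instead of and, or and not, which would trigger the ValueError described earlier. A short sketch, with illustrative names:

```python
import pandas as pd

s = pd.Series(range(5))

# Element-wise comparisons yield Boolean Series; combine them with
# & (and), | (or) and ~ (not). The parentheses are required because
# these operators bind more tightly than the comparisons.
mask = (s > 1) & (s != 4)
selected = s[mask]
print(selected.tolist())  # [2, 3]
```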

isin Operation

This returns a Boolean Series showing whether each element in the Series is exactly contained in the passed sequence of values.

 import pandas as pd
 s = pd.Series(list('abc'))
 s = s.isin(['a', 'c', 'e'])
 print(s)

The running results are as follows:

 0     True
 1    False
 2     True
 dtype: bool
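The Boolean Series returned by isin is typically used as a mask to filter the original data. A minimal sketch of that pattern (the name mask is illustrative):

```python
import pandas as pd

s = pd.Series(list('abc'))

# isin() builds the Boolean mask; indexing with it keeps matching rows.
mask = s.isin(['a', 'c', 'e'])
matching = s[mask].tolist()
others = s[~mask].tolist()
print(matching)  # ['a', 'c']
print(others)    # ['b']
```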

Reindexing vs ix Gotcha

Many users find themselves using the ix indexing capabilities as a concise means of selecting data from a pandas object (this applies to pandas versions before 1.0, which still provide .ix):

 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(6, 4),
                   columns=['one', 'two', 'three', 'four'],
                   index=list('abcdef'))
 print(df)
 print(df.ix[['b', 'c', 'e']])

The running results are as follows:

        one        two      three       four
a   -1.582025   1.335773   0.961417  -1.272084
b    1.461512   0.111372  -0.072225   0.553058
c   -1.240671   0.762185   1.511936  -0.630920
d   -2.380648  -0.029981   0.196489   0.531714
e    1.846746   0.148149   0.275398  -0.244559
f   -1.842662  -0.933195   2.303949   0.677641
          one        two      three       four
b    1.461512   0.111372  -0.072225   0.553058
c   -1.240671   0.762185   1.511936  -0.630920
e    1.846746   0.148149   0.275398  -0.244559

Of course, in this case, this is completely equivalent to using the reindex method:

 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(6, 4),
                   columns=['one', 'two', 'three', 'four'],
                   index=list('abcdef'))
 print(df)
 print(df.reindex(['b', 'c', 'e']))

The running results are as follows:

        one        two      three       four
a    1.639081   1.369838   0.261287  -1.662003
b   -0.173359   0.242447  -0.494384   0.346882
c   -0.106411   0.623568   0.282401  -0.916361
d   -1.078791  -0.612607  -0.897289  -1.146893
e    0.465215   1.552873  -1.841959   0.329404
f    0.966022  -0.190077   1.324247   0.678064
          one        two      three       four
b   -0.173359   0.242447  -0.494384   0.346882
c   -0.106411   0.623568   0.282401  -0.916361
e    0.465215   1.552873  -1.841959   0.329404

Based on this, one might conclude that ix and reindex are 100% equivalent. This is true except in the case of integer indexing. For example, the above operation can alternatively be expressed as:

 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(6, 4),
                   columns=['one', 'two', 'three', 'four'],
                   index=list('abcdef'))
 print(df)
 print(df.ix[[1, 2, 4]])       # ix falls back to positions: rows b, c, e
 print(df.reindex([1, 2, 4]))  # reindex looks up labels 1, 2, 4: all NaN

The running results are as follows:

        one        two      three       four
a   -1.015695  -0.553847   1.106235  -0.784460
b   -0.527398  -0.518198  -0.710546  -0.512036
c   -0.842803  -1.050374   0.787146   0.205147
d   -1.238016  -0.749554  -0.547470  -0.029045
e   -0.056788   1.063999  -0.767220   0.212476
f    1.139714   0.036159   0.201912   0.710119
          one        two      three       four
b   -0.527398  -0.518198  -0.710546  -0.512036
c   -0.842803  -1.050374   0.787146   0.205147
e   -0.056788   1.063999  -0.767220   0.212476
    one  two  three  four
1   NaN  NaN    NaN   NaN
2   NaN  NaN    NaN   NaN
4   NaN  NaN    NaN   NaN

It is important to remember that reindex is strict label indexing only. This can lead to some potentially surprising results in pathological cases where an index contains, say, both integers and strings.
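Since .ix is not available in pandas 1.0 and later, the same distinction can be reproduced with .loc (labels) and .iloc (positions). The sketch below assumes a current pandas version and uses deterministic data instead of random numbers so the results are easy to check; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(6, 4),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

by_label = df.loc[['b', 'c', 'e']]     # label-based selection
by_position = df.iloc[[1, 2, 4]]       # position-based selection
reindexed = df.reindex([1, 2, 4])      # labels 1, 2, 4 do not exist

print(by_label.equals(by_position))    # True
print(reindexed.isna().all().all())    # True
```

Making the choice between labels and positions explicit avoids the ambiguity that ix had with integer keys.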