Sets
Like a dict, a set in Python is a collection of keys, but it stores no values. Because keys cannot repeat, a set never contains duplicate elements.
Note that every element must be hashable, which in practice means an immutable type such as a number, a string, or a tuple of hashable items.
num = {}
print(type(num))  # <class 'dict'>
num = {1, 2, 3, 4}
print(type(num))  # <class 'set'>
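The hashability requirement can be illustrated with a minimal sketch: a tuple is accepted as a set element, while a list is rejected with a TypeError. The variable names are illustrative only, and the exact wording of the error message may differ between Python versions.

```python
# Hashable (immutable) objects such as tuples can be set elements.
points = {(0, 0), (1, 2)}
points.add((1, 2))           # already present, so nothing changes
print(len(points))           # 2

# A list is mutable and therefore unhashable.
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)     # the message mentions "unhashable"
```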
1. Creating sets
- Create an empty set first, then add elements to it.
- An empty set can only be created with s = set(), because s = {} creates an empty dictionary.
basket = set()
basket.add('apple')
basket.add('banana')
print(basket)  # {'banana', 'apple'}
- Enclose the elements directly in curly braces: {element 1, element 2, ..., element n}.
- Duplicate elements are removed from the set automatically.
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
print(basket)  # {'orange', 'apple', 'pear', 'banana'}
- Use the set(iterable) constructor to convert a list, tuple, string, or any other iterable into a set.
a = set('abracadabra')
print(a)  # {'r', 'b', 'd', 'c', 'a'}
b = set(("Google", "Laoguo", "Taobao", "Taobao"))
print(b)  # {'Taobao', 'Laoguo', 'Google'}
c = set(["Google", "Laoguo", "Taobao", "Google"])
print(c)  # {'Taobao', 'Laoguo', 'Google'}

# Remove duplicate elements from the list
lst = [0, 1, 2, 3, 4, 5, 5, 3, 1]
temp = []
for item in lst:
    if item not in temp:
        temp.append(item)
print(temp)  # [0, 1, 2, 3, 4, 5]
a = set(lst)
print(list(a))  # [0, 1, 2, 3, 4, 5]
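Converting a list to a set does not promise to keep the original order. When order matters, one common alternative (an addition here, not part of the original example) is dict.fromkeys, which relies on dictionaries preserving insertion order in Python 3.7 and later. A minimal sketch:

```python
lst = [3, 1, 3, 2, 1, 5]

# Keep the first occurrence of each item, in the original order.
unique_in_order = list(dict.fromkeys(lst))
print(unique_in_order)  # [3, 1, 2, 5]

# If any deterministic order is enough, sort the set explicitly.
unique_sorted = sorted(set(lst))
print(unique_sorted)    # [1, 2, 3, 5]
```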
From these results we can see two characteristics of sets: they are unordered, and their elements are unique.
Because a set is unordered, it supports neither indexing nor slicing, and it has no keys for looking up values. We can, however, test whether an element is in the set.
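As a small illustration of these limits, the sketch below shows that indexing a set raises a TypeError, while membership tests work. Sorting the set into a list is one possible way to get positional access; the variable names are illustrative.

```python
s = {'Google', 'Baidu', 'Taobao'}

# Indexing is not supported on sets.
try:
    s[0]
except TypeError as exc:
    index_error = type(exc).__name__
    print(index_error)             # TypeError

# Membership tests are supported.
has_baidu = 'Baidu' in s
print(has_baidu)                   # True

# For positional access, build an ordered list first.
first_alphabetically = sorted(s)[0]
print(first_alphabetically)        # Baidu
```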
2. Accessing the elements of a set
- Use the built-in len() function to get the number of elements in a set.
s = set(['Google', 'Baidu', 'Taobao'])
print(len(s))  # 3
- Use a for loop to read the elements of a set one by one. The iteration order is arbitrary.
s = set(['Google', 'Baidu', 'Taobao'])
for item in s:
    print(item)
# Baidu
# Google
# Taobao
- Use in or not in to test whether an element is already in the set.
s = set(['Google', 'Baidu', 'Taobao'])
print('Taobao' in s)  # True
print('Facebook' not in s)  # True
3. Built-in set methods
- set.add(elmnt) adds an element to the set. If the element is already present, the set is left unchanged.
fruits = {"apple", "banana", "cherry"}
fruits.add("orange")
print(fruits)  # {'orange', 'cherry', 'banana', 'apple'}
fruits.add("apple")
print(fruits)  # {'orange', 'cherry', 'banana', 'apple'}
- set.update(iterable) modifies the current set in place by adding every element of the given set or other iterable. Elements that are already present are ignored, so each element still appears only once.
x = {"apple", "banana", "cherry"}
y = {"google", "baidu", "apple"}
x.update(y)
print(x)  # {'cherry', 'banana', 'apple', 'google', 'baidu'}
y.update(["huawei", "xiaomi"])
print(y)  # {'huawei', 'apple', 'baidu', 'xiaomi', 'google'}
- set.remove(item) removes the specified element from the set. If the element does not exist, a KeyError is raised.
fruits = {"apple", "banana", "cherry"}
fruits.remove("banana")
print(fruits)  # {'apple', 'cherry'}
- set.discard(value) also removes the specified element, but unlike remove() it does not raise an error when the element is missing.
fruits = {"apple", "banana", "cherry"}
fruits.discard("banana")
print(fruits)  # {'apple', 'cherry'}
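The difference between the two methods only shows up when the element is missing. A minimal sketch, using a hypothetical element "mango" that is not in the set:

```python
fruits = {"apple", "banana", "cherry"}

# discard() silently ignores a missing element.
fruits.discard("mango")

# remove() raises KeyError for a missing element.
try:
    fruits.remove("mango")
except KeyError as exc:
    missing = exc.args[0]
    print("not found:", missing)  # not found: mango

print(len(fruits))  # 3
```

discard() suits cases where absence is acceptable, while remove() helps surface unexpected state early.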
- set.pop() removes and returns an arbitrary element. Which element is removed is not specified, so the output below is only one possible result.
fruits = {"apple", "banana", "cherry"}
x = fruits.pop()
print(fruits)  # {'cherry', 'apple'}
print(x)  # banana
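Since pop() gives no control over which element comes out, it is mostly useful for draining a set. The sketch below (with illustrative names) empties a set in a loop and shows that calling pop() on an empty set raises KeyError:

```python
pending = {"a", "b", "c"}
processed = []

# An empty set is falsy, so the loop stops once everything is popped.
while pending:
    processed.append(pending.pop())

print(sorted(processed))  # ['a', 'b', 'c']

# pop() on an empty set raises KeyError.
pop_failed = False
try:
    pending.pop()
except KeyError:
    pop_failed = True
print(pop_failed)         # True
```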
Because a set is an unordered collection of unique elements, two or more sets support set operations in the mathematical sense.
- set.intersection(set1, set2) returns the intersection of the sets as a new set.
- set1 & set2 returns the intersection of two sets.
- set.intersection_update(set1, set2) updates the original set in place, keeping only the elements that also appear in the other sets. It returns None.
a = set('abracadabra')
b = set('alacazam')
print(a)  # {'r', 'a', 'c', 'b', 'd'}
print(b)  # {'c', 'a', 'l', 'm', 'z'}
c = a.intersection(b)
print(c)  # {'a', 'c'}
print(a & b)  # {'c', 'a'}
print(a)  # {'a', 'r', 'c', 'b', 'd'}
a.intersection_update(b)
print(a)  # {'a', 'c'}
- set.union(set1, set2) returns the union of the sets as a new set.
- set1 | set2 returns the union of two sets.
a = set('abracadabra')
b = set('alacazam')
print(a)  # {'r', 'a', 'c', 'b', 'd'}
print(b)  # {'c', 'a', 'l', 'm', 'z'}
print(a | b)  # {'l', 'd', 'm', 'b', 'a', 'r', 'z', 'c'}
c = a.union(b)
print(c)  # {'c', 'a', 'd', 'm', 'r', 'b', 'z', 'l'}
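One practical difference between the method form and the operator form: methods such as union() and intersection() accept any iterable, while the operators require both operands to be sets. A minimal sketch with illustrative values:

```python
a = {1, 2, 3}

# The method accepts any iterables, and more than one of them.
merged = a.union([3, 4], (5,))
print(sorted(merged))  # [1, 2, 3, 4, 5]

# The operator only works between sets (or frozensets).
operator_rejected_list = False
try:
    a | [3, 4]
except TypeError:
    operator_rejected_list = True
print(operator_rejected_list)  # True
```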
- set.difference(set) returns a new set with the elements of the set that are not in the other set.
- set1 - set2 returns the same difference.
- set.difference_update(set) removes the elements of the other set from the original set in place and returns None.
a = set('abracadabra')
b = set('alacazam')
print(a)  # {'r', 'a', 'c', 'b', 'd'}
print(b)  # {'c', 'a', 'l', 'm', 'z'}
c = a.difference(b)
print(c)  # {'b', 'd', 'r'}
print(a - b)  # {'d', 'b', 'r'}
print(a)  # {'r', 'd', 'c', 'a', 'b'}
a.difference_update(b)
print(a)  # {'d', 'r', 'b'}

a = set('abracadabra')
b = set('alacazam')
print(a)  # {'d', 'a', 'b', 'c', 'r'}
print(b)  # {'c', 'a', 'l', 'm', 'z'}
d = b.difference(a)
print(d)  # {'m', 'z', 'l'}
print(b - a)  # {'m', 'z', 'l'}
print(b)  # {'c', 'a', 'l', 'm', 'z'}
b.difference_update(a)
print(b)  # {'m', 'z', 'l'}
- set.symmetric_difference(set) returns the symmetric difference of two sets: the elements that are in exactly one of them.
- set1 ^ set2 returns the same symmetric difference.
- set.symmetric_difference_update(set) updates the current set in place: it removes the elements that also appear in the other set and inserts the elements of the other set that were not in the current set.
a = set('abracadabra')
b = set('alacazam')
print(a)  # {'r', 'a', 'c', 'b', 'd'}
print(b)  # {'c', 'a', 'l', 'm', 'z'}
c = a.symmetric_difference(b)
print(c)  # {'m', 'r', 'l', 'b', 'z', 'd'}
print(a ^ b)  # {'m', 'r', 'l', 'b', 'z', 'd'}
print(a)  # {'r', 'd', 'c', 'a', 'b'}
a.symmetric_difference_update(b)
print(a)  # {'r', 'b', 'm', 'l', 'z', 'd'}
- set.issubset(set) tests whether every element of the set is contained in the other set. It returns True if so, otherwise False.
- set1 <= set2 performs the same subset test.
x = {"a", "b", "c"}
y = {"f", "e", "d", "c", "b", "a"}
z = x.issubset(y)
print(z)  # True
print(x <= y)  # True

x = {"a", "b", "c"}
y = {"f", "e", "d", "c", "b"}
z = x.issubset(y)
print(z)  # False
print(x <= y)  # False
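The <= test also holds when the two sets are equal. Python additionally offers the < operator for a proper (strict) subset test, which has no method equivalent. A short sketch with illustrative values:

```python
x = {"a", "b"}
y = {"a", "b", "c"}

print(x <= y)  # True: x is a subset of y
print(x < y)   # True: x is a proper subset (subset and not equal)
print(y <= y)  # True: every set is a subset of itself
print(y < y)   # False: a set is never a proper subset of itself
```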
- set.issuperset(set) tests whether the set contains every element of the other set. It returns True if so, otherwise False.
- set1 >= set2 performs the same superset test.
x = {"f", "e", "d", "c", "b", "a"}
y = {"a", "b", "c"}
z = x.issuperset(y)
print(z)  # True
print(x >= y)  # True

x = {"f", "e", "d", "c", "b"}
y = {"a", "b", "c"}
z = x.issuperset(y)
print(z)  # False
print(x >= y)  # False
- set.isdisjoint(set) tests whether two sets have no elements in common. It returns True if they share nothing, otherwise False.
x = {"f", "e", "d", "c", "b"}
y = {"a", "b", "c"}
z = x.isdisjoint(y)
print(z)  # False

x = {"f", "e", "d", "m", "g"}
y = {"a", "b", "c"}
z = x.isdisjoint(y)
print(z)  # True
4. Set conversion
se = set(range(4))
li = list(se)
tu = tuple(se)
print(se, type(se))  # {0, 1, 2, 3} <class 'set'>
print(li, type(li))  # [0, 1, 2, 3] <class 'list'>
print(tu, type(tu))  # (0, 1, 2, 3) <class 'tuple'>
5. Immutable sets
Python also provides an immutable version of the set type, called frozenset, whose elements cannot be added or removed. Note that a frozenset still supports the non-mutating set operations (union, intersection, and so on), but it has no mutating methods such as update(), add(), or pop().
- frozenset([iterable]) returns a frozen set. Once it is created, no elements can be added to or removed from it.
a = frozenset(range(10))  # Generate a new immutable set
print(a)  # frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9})
b = frozenset('laoguo')
print(b)  # frozenset({'o', 'a', 'l', 'g', 'u'})
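Because a frozenset is immutable, it is also hashable, so it can serve as a dictionary key or as an element of another set, which an ordinary set cannot. The sketch below uses illustrative names to show this along with the missing mutating methods:

```python
a = frozenset([1, 2, 3])
b = frozenset([3, 4])

# Non-mutating operations still work and return frozensets.
print(a | b == frozenset([1, 2, 3, 4]))  # True
print(type(a & b).__name__)              # frozenset

# A frozenset is hashable: usable as a dict key or a set element.
visited = {a: "first group"}
nested = {a, b}
print(len(nested))                       # 2

# Mutating methods such as add() do not exist on frozenset.
print(hasattr(a, "add"))                 # False
```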
Sequences
In Python, the built-in container types include strings, lists, tuples, sets, and dictionaries. They share some general operations, but sets and dictionaries are not sequences in the strict sense: they do not support indexing by position, slicing, addition (concatenation), or multiplication (repetition).
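One way to check which built-in types count as sequences is the Sequence abstract base class from the standard collections.abc module. The sketch below is an illustration added here; the sample values are arbitrary.

```python
from collections.abc import Sequence

samples = ["text", [1, 2], (1, 2), {1, 2}, {"k": "v"}]

# Map each type name to whether it is registered as a Sequence.
is_sequence = {type(obj).__name__: isinstance(obj, Sequence) for obj in samples}
print(is_sequence)
# {'str': True, 'list': True, 'tuple': True, 'set': False, 'dict': False}

# Sequences support concatenation and repetition.
print([1, 2] + [3])  # [1, 2, 3]
print("ab" * 2)      # abab
```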
Built-in functions for sequences
- list(sub) converts an iterable object into a list.
a = list()
print(a)  # []
b = 'I Love Python'
b = list(b)
print(b)  # ['I', ' ', 'L', 'o', 'v', 'e', ' ', 'P', 'y', 't', 'h', 'o', 'n']
c = (1, 1, 2, 3, 5, 8)
c = list(c)
print(c)  # [1, 1, 2, 3, 5, 8]
- tuple(sub) converts an iterable object into a tuple.
a = tuple()
print(a)  # ()
b = 'I Love Python'
b = tuple(b)
print(b)  # ('I', ' ', 'L', 'o', 'v', 'e', ' ', 'P', 'y', 't', 'h', 'o', 'n')
c = [1, 1, 2, 3, 5, 8]
c = tuple(c)
print(c)  # (1, 1, 2, 3, 5, 8)
- str(obj) converts the object obj into a string.
a = 123
a = str(a)
print(a)  # 123
- len(s) returns the length of an object (string, list, tuple, etc.), that is, its number of elements.
- s – the object to measure.
a = list()
print(len(a))  # 0
b = ('I', ' ', 'L', 'o', 'v', 'e', ' ', 'P', 'y', 't', 'h', 'o', 'n')
print(len(b))  # 13
c = 'I Love Python'
print(len(c))  # 13
- max(sub) returns the largest item in a sequence or among its arguments.
print(max(1, 2, 3, 4, 5))  # 5
print(max([-8, 99, 3, 7, 83]))  # 99
print(max('IlovePython'))  # y
- min(sub) returns the smallest item in a sequence or among its arguments.
print(min(1, 2, 3, 4, 5))  # 1
print(min([-8, 99, 3, 7, 83]))  # -8
print(min('IlovePython'))  # I
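Both functions also take two optional keyword arguments that the examples above do not use: key, a one-argument function used for comparison, and default, a value returned when the iterable is empty. A brief sketch with illustrative data:

```python
words = ["pear", "fig", "banana"]

# Compare by length instead of alphabetical order.
longest = max(words, key=len)
shortest = min(words, key=len)
print(longest)   # banana
print(shortest)  # fig

# Without default, max() of an empty iterable raises ValueError.
fallback = max([], default=None)
print(fallback)  # None
```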
- sum(iterable[, start=0]) returns the sum of the items in iterable plus the optional start value.
print(sum([1, 3, 5, 7, 9]))  # 25
print(sum([1, 3, 5, 7, 9], 10))  # 35
print(sum((1, 3, 5, 7, 9)))  # 25
print(sum((1, 3, 5, 7, 9), 20))  # 45
- sorted(iterable, key=None, reverse=False) returns a new sorted list built from the items of any iterable.
- iterable – the iterable object to sort.
- key – a function of one argument that is called on each element of the iterable; the elements are sorted by the values it returns.
- reverse – the sort order: reverse=True sorts in descending order, reverse=False in ascending order (the default).
- Returns a new sorted list; the original iterable is left unchanged.
x = [-8, 99, 3, 7, 83]
print(sorted(x))  # [-8, 3, 7, 83, 99]
print(sorted(x, reverse=True))  # [99, 83, 7, 3, -8]
t = ({"age": 20, "name": "a"}, {"age": 25, "name": "b"}, {"age": 10, "name": "c"})
x = sorted(t, key=lambda a: a["age"])
print(x)  # [{'age': 10, 'name': 'c'}, {'age': 20, 'name': 'a'}, {'age': 25, 'name': 'b'}]
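Two details of sorted() are worth a small sketch: the sort is stable (items with equal keys keep their original relative order), and it differs from list.sort(), which sorts in place and returns None. The records below are made-up sample data.

```python
records = [("b", 2), ("a", 2), ("c", 1)]

# Stable sort: ("b", 2) stays ahead of ("a", 2) because their keys are equal.
by_count = sorted(records, key=lambda r: r[1])
print(by_count)  # [('c', 1), ('b', 2), ('a', 2)]
print(records)   # [('b', 2), ('a', 2), ('c', 1)]  (unchanged)

# list.sort() modifies the list in place and returns None.
result = records.sort()
print(result)    # None
print(records)   # [('a', 2), ('b', 2), ('c', 1)]
```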
- reversed(seq) returns a reverse iterator over the sequence.
- seq – the sequence to reverse, which can be a tuple, string, list, or range.
s = 'python'
x = reversed(s)
print(type(x))  # <class 'reversed'>
print(x)  # <reversed object at 0x00000139445C0160>
print(list(x))  # ['n', 'o', 'h', 't', 'y', 'p']
t = ('n', 'o', 'h', 't', 'y', 'p')
print(list(reversed(t)))  # ['p', 'y', 't', 'h', 'o', 'n']
r = range(5, 9)
print(list(reversed(r)))  # [8, 7, 6, 5]
x = [-8, 99, 3, 7, 83]
print(list(reversed(x)))  # [83, 7, 3, 99, -8]
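Because reversed() returns an iterator rather than a list, it can be consumed only once. The sketch below shows the second pass coming back empty, and, as an alternative not covered above, slicing with a step of -1 to get a reversed copy of the same type.

```python
it = reversed([1, 2, 3])

first_pass = list(it)
second_pass = list(it)  # the iterator is already exhausted
print(first_pass)       # [3, 2, 1]
print(second_pass)      # []

# Slicing creates a reversed copy that can be reused freely.
reversed_text = "python"[::-1]
print(reversed_text)    # nohtyp
```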
- enumerate(sequence, [start=0])
- Pairs each item of an iterable (such as a list, tuple, or string) with its index, yielding (index, item) tuples. It is typically used in a for loop.
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
a = list(enumerate(seasons))
print(a)  # [(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
b = list(enumerate(seasons, 1))
print(b)  # [(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
for i, element in a:
    print('{0},{1}'.format(i, element))
# 0,Spring
# 1,Summer
# 2,Fall
# 3,Winter
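In everyday code, enumerate() is usually placed directly in the for statement rather than converted to a list first. A minimal sketch, where the numbered-label format is just an illustration:

```python
seasons = ['Spring', 'Summer', 'Fall', 'Winter']

labels = []
# start=1 makes the counter begin at 1 instead of 0.
for number, season in enumerate(seasons, start=1):
    labels.append(f"{number}. {season}")

print(labels)  # ['1. Spring', '2. Summer', '3. Fall', '4. Winter']
```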
- zip(iter1 [,iter2 [...]])
- Takes iterables as arguments, packs their corresponding elements into tuples, and returns an iterator over those tuples. Because the tuples are produced lazily, this saves memory for large inputs.
- Use list() to materialize the result as a list.
- If the iterables have different lengths, the result is as long as the shortest one. The * operator can be used to unzip the tuples back into separate sequences.
a = [1, 2, 3]
b = [4, 5, 6]
c = [4, 5, 6, 7, 8]
zipped = zip(a, b)
print(zipped)  # <zip object at 0x000000C5D89EDD88>
print(list(zipped))  # [(1, 4), (2, 5), (3, 6)]
zipped = zip(a, c)
print(list(zipped))  # [(1, 4), (2, 5), (3, 6)]
a1, a2 = zip(*zip(a, b))
print(list(a1))  # [1, 2, 3]
print(list(a2))  # [4, 5, 6]
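Two common follow-ups to zip() can be sketched briefly: building a dictionary from two parallel lists, and keeping the unmatched items with itertools.zip_longest from the standard library when the inputs differ in length. The names and values are illustrative.

```python
from itertools import zip_longest

names = ["a", "b", "c"]
ages = [20, 25]

# zip() stops at the shortest input, so "c" is dropped.
age_by_name = dict(zip(names, ages))
print(age_by_name)  # {'a': 20, 'b': 25}

# zip_longest() pads the shorter input with fillvalue instead.
padded = list(zip_longest(names, ages, fillvalue=None))
print(padded)       # [('a', 20), ('b', 25), ('c', None)]
```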