Common Python examples, part 1


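Python code for converting Chinese to pinyin (full pinyin and first-letter abbreviations)

by Crazyant

The code in this article is an upgraded version of https://github.com/cleverdeng/pinyin.py, with the following changes to the original:

1. A firstcode parameter can be passed in: if true, only the first pinyin letter of each Chinese character is output; if false, the full pinyin is output.

2. Fix: English letters are output unchanged.

3. Fix: output still works when the separator is an empty string.

4. Upgrade: the path of the dictionary file can be specified.

The code is simple: it loads a dictionary that maps each character to its romanized spelling, then replaces the Chinese characters one by one with their pinyin.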

Python

#!/usr/bin/env python
# -*- coding:utf-8 -*-

"""
Original code: https://github.com/cleverdeng/pinyin.py

New features:
    1. Accepts a firstcode parameter: if True, only the first pinyin letter of each Chinese character is returned; if False, the full pinyin is returned.
    2. Fix: English letters are output unchanged.
    3. Fix: output still works when the separator is an empty string.
    4. Upgrade: the path of the dictionary file can be specified.
"""

__version__ = '0.9'
__all__ = ["PinYin"]

import os.path


class PinYin(object):
    def __init__(self):
        self.word_dict = {}

    def load_word(self, dict_file):
        self.dict_file = dict_file
        if not os.path.exists(self.dict_file):
            raise IOError("NotFoundFile")

        with open(self.dict_file) as f_obj:
            for f_line in f_obj:
                # each line holds a key and its pinyin, separated by whitespace
                line = f_line.split(None, 1)
                if len(line) == 2:
                    self.word_dict[line[0]] = line[1]

    def hanzi2pinyin(self, string="", firstcode=False):
        result = []
        if not isinstance(string, unicode):
            string = string.decode("utf-8")

        for char in string:
            # dictionary keys are upper-case hexadecimal code points
            key = '%X' % ord(char)
            value = self.word_dict.get(key)
            if value:
                # take the first reading and drop its last character
                outpinyin = value.split()[0][:-1].lower()
            else:
                # characters not in the dictionary (e.g. English letters) pass through unchanged
                outpinyin = char
            if firstcode:
                result.append(outpinyin[0])
            else:
                result.append(outpinyin)

        return result

    def hanzi2pinyin_split(self, string="", split="", firstcode=False):
        """Extract the pinyin of Chinese text
        @param string: the Chinese text to convert
        @param split: the separator
        @param firstcode: full pinyin or first letters? True extracts first letters; the default False extracts full pinyin
        """
        result = self.hanzi2pinyin(string=string, firstcode=firstcode)
        return split.join(result)


if __name__ == "__main__":
    test = PinYin()
    test.load_word('word.data')
    string = "Java程序性能优化-让你的Java程序更快更稳定"
    print "in: %s" % string
    print "out: %s" % str(test.hanzi2pinyin(string=string))
    print "out: %s" % test.hanzi2pinyin_split(string=string, split="", firstcode=True)
    print "out: %s" % test.hanzi2pinyin_split(string=string, split="", firstcode=False)

Output of the main block in the example:

How to use the code:

If you need a different kind of extraction, the code can be modified accordingly.
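The lookup that the class performs can also be sketched for Python 3 (the article's code targets Python 2). This is only an illustration, not the author's code: the function name `to_pinyin` and the three-entry `SAMPLE_DICT` are hypothetical, and the entry format (hexadecimal code point mapped to upper-case pinyin with a trailing tone digit) is an assumption inferred from how the class parses its dictionary.

```python
# Hypothetical miniature dictionary: hex code point -> pinyin readings with tone digits.
SAMPLE_DICT = {
    "4E2D": "ZHONG1 ZHONG4",  # 中
    "6587": "WEN2",           # 文
    "7A0B": "CHENG2",         # 程
}


def to_pinyin(text, word_dict, firstcode=False):
    """Return a list with one pinyin string (or its first letter) per character."""
    result = []
    for char in text:
        value = word_dict.get("%X" % ord(char))
        if value:
            # Use the first listed reading and strip the trailing tone digit.
            pinyin = value.split()[0][:-1].lower()
        else:
            # Characters missing from the dictionary pass through unchanged.
            pinyin = char
        result.append(pinyin[0] if firstcode else pinyin)
    return result


print(to_pinyin("中文ab", SAMPLE_DICT))                           # ['zhong', 'wen', 'a', 'b']
print("".join(to_pinyin("中文ab", SAMPLE_DICT, firstcode=True)))  # zwab
```

Returning a list and joining it afterwards keeps the separator choice with the caller, which is why an empty separator needs no special handling.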

Download the code (including the dictionary) as a package:


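Reading files in Python with a list field schema or a dict field schema

Python is an excellent tool for processing text data. Its very simple support for reading, splitting, filtering, and converting means developers need not deal with tedious stream handling (compared with Java, that is). In my own work I do complex text processing and computation entirely in Python, including Streaming programs on Hadoop.

The first step in text processing is loading the file into memory, which raises the question of how to map each column of the file to a specific variable. The crudest approach is to refer to fields by index, like this:

The crudest way to map a file line to fields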

Python

# fields is the list obtained by reading one line and splitting it on the delimiter
user_id = fields[0]
user_name = fields[1]
user_type = fields[2]

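With this approach, any change in column order or any added or removed column turns maintenance into a nightmare. Code like this should be avoided.

This article recommends two cleaner ways to read the data. Both configure a field schema first and then read according to it; the schema takes either a dict form or a list form.

Reading the file and splitting each line into a list of fields

First read the file and split each line on the delimiter, returning the list of fields for later processing.

The code is as follows:

Function that reads and splits the file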

def read_file_data(filepath):
    '''Read the file at the given path line by line.

    @param filepath: path of the file to read
    @return: for each line, the list of fields obtained by splitting on tabs
    '''
    with open(filepath, 'r') as fin:
        for line in fin:
            # drop the trailing newline and skip empty lines
            line = line.rstrip("\n")
            if not line:
                continue
            # yield the field list of the current line
            yield line.split("\t")

Because the function uses the yield keyword to emit the split fields of one line at a time, the caller can read every line with for fields in read_file_data(fpath).
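A minimal Python 3 sketch of this generator pattern follows. The name `iter_fields`, the `sep` parameter, and the throwaway sample file are assumptions made for illustration; they are not part of the article's file_util.py.

```python
import os
import tempfile


def iter_fields(filepath, sep="\t"):
    """Yield the list of fields for each non-empty line of the file."""
    with open(filepath, "r", encoding="utf-8") as fin:
        for line in fin:
            line = line.rstrip("\n")
            if line:
                yield line.split(sep)


# Build a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("1\tname1\t0\n\n2\tname2\t1\n")
    sample_path = tmp.name

rows = list(iter_fields(sample_path))
os.remove(sample_path)
print(rows)  # [['1', 'name1', '0'], ['2', 'name2', '1']]
```

Because the function is a generator, only one line is held in memory at a time; `list(...)` is used here just to show the result.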

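Mapping to a model, method 1: assembling the field list with a configured dict schema

This method configures a dict of the form {"field name": field position} as the data schema, assembles each field list according to it, and so lets the data be accessed by key.

The function used:

Assembling a field list with a dict schema so values can be read by key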

Python

@staticmethod
def map_fields_dict_schema(fields, dict_schema):
    """Pair the schema with the data values. For example, if fields is ['a','b','c'] and the schema is {'name':0, 'age':1}, the result is {'name':'a','age':'b'}

    @param fields: list holding the data, usually obtained by splitting a line string on tabs
    @param dict_schema: a dict whose keys are field names and whose values are field positions
    @return: a dict whose keys are field names and whose values are field values
    """
    pdict = {}
    for fstr, findex in dict_schema.items():
        pdict[fstr] = str(fields[int(findex)])
    return pdict
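As a hedged illustration of the same idea in Python 3, the mapping can be written as a dict comprehension. The function name `map_by_dict_schema` and the sample row are made up for this sketch.

```python
def map_by_dict_schema(fields, dict_schema):
    """Build {field name: value} from a row and a {field name: column index} schema."""
    return {name: fields[index] for name, index in dict_schema.items()}


row = ["7", "name7", "6", "extra column"]
schema = {"userid": 0, "usertype": 2}  # only the columns of interest
record = map_by_dict_schema(row, schema)
print(record)  # {'userid': '7', 'usertype': '6'}
```

Columns that the schema does not mention, such as the fourth one here, are simply ignored, so the file can gain columns without breaking the reader.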

With this method and the previous one, the data can be read as follows:

Example of reading data with a dict schema

Python

# coding:utf8
"""
@author: www.crazyant.net
Test loading field lists with a dict schema
Pros: for a file with many columns, configuring only the fields you need is enough to read those columns
Cons: with many fields, configuring the position of every field is tedious
"""
import file_util
import pprint

# the dict schema to read; only the positions of the columns you care about need to be configured
dict_schema = {"userid": 0, "username": 1, "usertype": 2}
for fields in file_util.FileUtil.read_file_data("userfile.txt"):
    # map the field list according to the dict schema
    dict_fields = file_util.FileUtil.map_fields_dict_schema(fields, dict_schema)
    pprint.pprint(dict_fields)

Output:

Dict data loaded with the dict schema

Python

{'userid': '1', 'username': 'name1', 'usertype': '0'}
{'userid': '2', 'username': 'name2', 'usertype': '1'}
{'userid': '3', 'username': 'name3', 'usertype': '2'}
{'userid': '4', 'username': 'name4', 'usertype': '3'}
{'userid': '5', 'username': 'name5', 'usertype': '4'}
{'userid': '6', 'username': 'name6', 'usertype': '5'}
{'userid': '7', 'username': 'name7', 'usertype': '6'}
{'userid': '8', 'username': 'name8', 'usertype': '7'}
{'userid': '9', 'username': 'name9', 'usertype': '8'}
{'userid': '10', 'username': 'name10', 'usertype': '9'}
{'userid': '11', 'username': 'name11', 'usertype': '10'}
{'userid': '12', 'username': 'name12', 'usertype': '11'}

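Mapping to a model, method 2: assembling the field list with a configured list schema

If you need to read every column of the file, or the first several columns, configuring a dict schema is somewhat cumbersome, because every field needs an index position, counted upward from 0. This is menial work that should be eliminated.

The list schema exists for exactly this purpose: the configured list schema is first converted into a dict schema, and the data is then loaded through the dict schema.

The code that converts the schema and reads with a list schema:

Functions for reading data with a list schema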

Python

@staticmethod
def transform_list_to_dict(para_list):
    """Convert ['a', 'b'] into the form {'a':0, 'b':1}

    @param para_list: list holding the field name of each column
    @return: dict mapping field names to positions
    """
    res_dict = {}
    for idx, name in enumerate(para_list):
        res_dict[str(name).strip()] = idx
    return res_dict


@staticmethod
def map_fields_list_schema(fields, list_schema):
    """Pair the schema with the data values. For example, if fields is ['a','b','c'] and the schema is ['name', 'age'], the result is {'name':'a','age':'b'}

    @param fields: list holding the data, usually obtained by splitting a line string on tabs
    @param list_schema: list of column names
    @return: a dict whose keys are field names and whose values are field values
    """
    dict_schema = FileUtil.transform_list_to_dict(list_schema)
    return FileUtil.map_fields_dict_schema(fields, dict_schema)
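The conversion from a list schema to a dict schema amounts to numbering the names in order. A short Python 3 sketch follows, with hypothetical function names and sample data chosen only for illustration.

```python
def list_to_dict_schema(list_schema):
    """Turn ['a', 'b'] into {'a': 0, 'b': 1}."""
    return {name.strip(): index for index, name in enumerate(list_schema)}


def map_by_list_schema(fields, list_schema):
    """Pair each listed field name with the value at the same position."""
    schema = list_to_dict_schema(list_schema)
    return {name: fields[index] for name, index in schema.items()}


print(list_to_dict_schema(["userid", "username"]))  # {'userid': 0, 'username': 1}
print(map_by_list_schema(["3", "name3", "2"], ["userid", "username"]))
# {'userid': '3', 'username': 'name3'}
```

Since the positions come from the order of the list, this form can only describe a leading run of columns; skipping a column in the middle still requires the dict schema.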

In use, the schema is configured as a list, which is more concise because no indexes are needed:

Calling code that reads data with a list schema

Python

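# coding:utf8
"""
@author: www.crazyant.net
Test loading field lists with a list schema
Pros: to read all columns, the list schema only needs the field names of the columns written in order
Cons: you cannot read only the fields you care about; the leading columns must all be read
"""
import file_util
import pprint

# the list schema to read; it can only cover the leading columns or all columns
list_schema = ["userid", "username", "usertype"]
for fields in file_util.FileUtil.read_file_data("userfile.txt"):
    # map the field list according to the list schema
    dict_fields = file_util.FileUtil.map_fields_list_schema(fields, list_schema)
    pprint.pprint(dict_fields)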

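The output is exactly the same as with the dict schema.

Full code of file_util.py

Below is the complete code of file_util.py, which you can put in your own shared library.

file_util.py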

Python

# -*- encoding:utf8 -*-
'''
@author: www.crazyant.net
@version: 2014-12-5
'''


class FileUtil(object):
    '''Common operations on files and paths
    '''

    @staticmethod
    def read_file_data(filepath):
        '''Read the file at the given path line by line.

        @param filepath: path of the file to read
        @return: for each line, the list of fields obtained by splitting on tabs
        '''
        with open(filepath, 'r') as fin:
            for line in fin:
                # drop the trailing newline and skip empty lines
                line = line.rstrip("\n")
                if not line:
                    continue
                # yield the field list of the current line
                yield line.split("\t")

    @staticmethod
    def transform_list_to_dict(para_list):
        """Convert ['a', 'b'] into the form {'a':0, 'b':1}

        @param para_list: list holding the field name of each column
        @return: dict mapping field names to positions
        """
        res_dict = {}
        for idx, name in enumerate(para_list):
            res_dict[str(name).strip()] = idx
        return res_dict

    @staticmethod
    def map_fields_list_schema(fields, list_schema):
        """Pair the schema with the data values. For example, if fields is ['a','b','c'] and the schema is ['name', 'age'], the result is {'name':'a','age':'b'}

        @param fields: list holding the data, usually obtained by splitting a line string on tabs
        @param list_schema: list of column names
        @return: a dict whose keys are field names and whose values are field values
        """
        dict_schema = FileUtil.transform_list_to_dict(list_schema)
        return FileUtil.map_fields_dict_schema(fields, dict_schema)

    @staticmethod
    def map_fields_dict_schema(fields, dict_schema):
        """Pair the schema with the data values. For example, if fields is ['a','b','c'] and the schema is {'name':0, 'age':1}, the result is {'name':'a','age':'b'}

        @param fields: list holding the data, usually obtained by splitting a line string on tabs
        @param dict_schema: a dict whose keys are field names and whose values are field positions
        @return: a dict whose keys are field names and whose values are field values
        """
        pdict = {}
        for fstr, findex in dict_schema.items():
            pdict[fstr] = str(fields[int(findex)])
        return pdict

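Video tutorial: using MySQL from Python

Here is a video tutorial I made on using MySQL from Python. It has three parts: setting up the Python development environment and installing the package that supports MySQL development; an explanation of the standard API specification for accessing MySQL databases from Python; and a live coding demonstration of a Python program that uses MySQL. After the course you should have a basic grasp of developing MySQL programs in Python.

High-definition version of the videos on Baidu: http://pan.baidu.com/s/1DB0qM password: ri1n

Python and MySQL video tutorial, part 1: setting up the development environment

The following development environment is recommended:

Eclipse + JDK7

Plugin: PyDev 3.8.0

python-2.7.8

Package: MySQL-python-1.2.4b4.win32-py2.7

MySQL server: the MySQL bundled with the wampserver2.5 package

Also required: vcredist_x64
Mysql-5.6.17

This video on Youku: http://v.youku.com/v_show/id_XODE3Nzk4MTEy.html

Python and MySQL video tutorial, part 2: the standard interface specification

Part 2 covers the following:

The official Python specification for database access

Document: http://legacy.python.org/dev/peps/pep-0249/

The connection object that Python establishes with the database

The constructor of the connection object, with parameters such as host, port, user name, password, and character encoding
The methods of the connection object, mainly closing the connection, getting a cursor, committing a transaction, and rolling back a transaction

The cursor object that executes SQL statements

The difference between a regular cursor and a dict cursor, and why the dict cursor is preferable
The cursor methods for executing SQL statements
The cursor methods for fetching the result set of an executed SQL statement

The skeleton of a Python program that accesses a database, which consists of these steps:

Import the MySQLdb module
Get a connection object
Get a regular cursor or a dict cursor
Execute the SQL statement
Fetch the data from the cursor and process it further
Close the connection

This video on Youku: http://v.youku.com/v_show/id_XODIxNzQ1MjQ0.html

Python and MySQL video tutorial, part 3: code demonstration

Part 3 covers the following:

The skeleton of a Python program that uses MySQL

Import the module: import MySQLdb
Get a connection: conn = MySQLdb.connect()
Get a cursor: cursor = conn.cursor()
Execute SQL: cursor.execute()
Fetch the data: cursor.fetchall()
Close the connection: conn.close()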
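The steps above can be sketched end to end. To keep the example runnable without a MySQL server, it uses the standard-library sqlite3 module as a stand-in; both sqlite3 and MySQLdb follow PEP 249, so the connect, cursor, execute, fetchall, close sequence is the same. The `user` table and its rows are made up, and with MySQLdb the connect call would take host, user, and password arguments and the placeholder would be `%s` rather than `?`.

```python
import sqlite3  # stand-in for MySQLdb; both follow PEP 249

# 1. Get a connection (in-memory database, so nothing needs to be installed).
conn = sqlite3.connect(":memory:")
try:
    # 2. Get a cursor.
    cursor = conn.cursor()
    # 3. Execute SQL (hypothetical table and data, for illustration only).
    cursor.execute("CREATE TABLE user (id INTEGER, name TEXT)")
    cursor.executemany("INSERT INTO user VALUES (?, ?)", [(1, "name1"), (2, "name2")])
    conn.commit()
    cursor.execute("SELECT id, name FROM user ORDER BY id")
    # 4. Fetch the result set.
    rows = cursor.fetchall()
finally:
    # 5. Close the connection even if a statement fails.
    conn.close()

print(rows)  # [(1, 'name1'), (2, 'name2')]
```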

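Differences between MySQL's InnoDB and MyISAM engines

InnoDB supports transactions; MyISAM does not
When working with an InnoDB table, the Python code must call conn.commit() after executing insert, delete, or update statements for the changes to take effect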
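The effect of a missing commit can be demonstrated with a small experiment. This sketch again uses sqlite3 as a stand-in for MySQLdb against an InnoDB table, which is an assumption of the example: sqlite3 also opens a transaction implicitly before an insert, so the behavior is analogous, but it is not MySQL itself. The file name `demo.db` and table `t` are hypothetical.

```python
import os
import sqlite3  # stand-in for MySQLdb against an InnoDB table
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE t (id INTEGER)")
conn.commit()

# Insert without committing, then close: the pending transaction is discarded.
conn.execute("INSERT INTO t VALUES (1)")
conn.close()

conn = sqlite3.connect(db_path)
count_without_commit = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# Insert again, this time committing before closing.
conn.execute("INSERT INTO t VALUES (2)")
conn.commit()
conn.close()

conn = sqlite3.connect(db_path)
count_with_commit = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()

print(count_without_commit, count_with_commit)  # 0 1
```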

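This video on Youku: http://v.youku.com/v_show/id_XODI4MjE4Njgw.html

The code and slides for this article on git:

The high-definition videos will be published on Baidu Netdisk later.

Original article: http://www.crazyant.net/1664.html. Please credit the source when reposting.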

Batch-renaming files in Python

Two functions from the os module are used:

1. List all entries in a folder (files and directories alike)

os.listdir(path)
Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.
Availability: Unix, Windows.
Changed in version 2.3: On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects

2. Rename a file

os.rename(src, dst)
Rename the file or directory src to dst. If dst is a directory, OSError will be raised. On Unix, if dst exists and is a file, it will be replaced silently if the user has permission. The operation may fail on some Unix flavors if src and dst are on different filesystems. If successful, the renaming will be an atomic operation (this is a POSIX requirement). On Windows, if dst already exists, OSError will be raised even if it is a file; there may be no way to implement an atomic rename when dst names an existing file.
Availability: Unix, Windows

import os

dirpath = "D:/workbench/crazyant.net/myfiles"
for fname in os.listdir(dirpath):
    # the new name drops the first three characters of the old name
    newfname = fname[3:]
    newfpath = "%s/%s" % (dirpath, newfname)
    oldfpath = "%s/%s" % (dirpath, fname)

    os.rename(oldfpath, newfpath)

In short, os.listdir lists everything in the directory, and os.rename then renames each file.
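A slightly more defensive variant is sketched below for Python 3. The function name `strip_prefix`, the skip rules, and the demo file names are assumptions added for illustration; the article's script renames every entry unconditionally.

```python
import os
import tempfile


def strip_prefix(dirpath, prefix_len):
    """Rename every regular file in dirpath by dropping its first prefix_len characters."""
    renamed = []
    for fname in sorted(os.listdir(dirpath)):
        oldfpath = os.path.join(dirpath, fname)
        newfname = fname[prefix_len:]
        newfpath = os.path.join(dirpath, newfname)
        # Skip directories, names that would become empty, and existing targets.
        if not os.path.isfile(oldfpath) or not newfname or os.path.exists(newfpath):
            continue
        os.rename(oldfpath, newfpath)
        renamed.append((fname, newfname))
    return renamed


# Try it on a throwaway directory with two hypothetical file names.
demo_dir = tempfile.mkdtemp()
for name in ("01_a.txt", "02_b.txt"):
    open(os.path.join(demo_dir, name), "w").close()

renamed = strip_prefix(demo_dir, 3)
print(renamed)                       # [('01_a.txt', 'a.txt'), ('02_b.txt', 'b.txt')]
print(sorted(os.listdir(demo_dir)))  # ['a.txt', 'b.txt']
```

Checking for an existing target matters because, as the quoted documentation notes, os.rename silently replaces an existing file on Unix and raises OSError on Windows.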

Official documentation of Python's os module: http://docs.python.org/2/library/os.html

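Using Python's built-in map, reduce, and filter functions in text processing

A file consists of many lines, and those lines form a list. Python provides three very useful functions for processing lists: map, reduce, and filter. Using them in text processing makes code more concise and clearer.

Here map and reduce are Python built-ins and have nothing to do with Hadoop's map and reduce functions, although their purposes are similar: map does per-item preprocessing, and reduce generally does aggregation.

Using map, reduce, and filter in text processing

Below is the content of a text file. Column 1 is the ID and column 4 is the weight. The goal is to take all rows whose ID is even, double their weights, and return the sum of the weights. The header row is shown only for readability; the data file itself contains just the ten data rows.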

ID	Key	Value	Weight
1	name1	value1	11
2	name2	value2	12
3	name3	value3	13
4	name4	value4	14
5	name5	value5	15
6	name6	value6	16
7	name7	value7	17
8	name8	value8	18
9	name9	value9	19
10	name10	value10	20

The code using filter, map, and reduce is as follows:

#coding=utf8

'''
Created on 2013-12-15

@author: www.crazyant.net
'''
import pprint


def read_file(file_path):
    '''
    Read the file line by line and yield the list of fields split on tabs.
    '''
    with open(file_path, "r") as fp:
        for line in fp:
            fields = line.rstrip("\n").split("\t")
            yield fields


def is_even_lines(fields):
    '''
    Return True if the number in the first column of the row is even.
    '''
    return int(fields[0]) % 2 == 0


def double_weights(fields):
    '''
    Double the value of the weight field (the last column) of the row.
    '''
    fields[-1] = int(fields[-1]) * 2
    return fields


def sum_weights(sum_value, fields):
    '''
    Add the weight of the row to sum_value and return the new sum_value.
    '''
    sum_value += int(fields[-1])
    return sum_value


if __name__ == "__main__":
    # read all lines of the file
    file_lines = [x for x in read_file("test_data")]
    print 'Original rows in the file:'
    pprint.pprint(file_lines)

    print '----'

    # keep only the rows whose ID is even
    even_lines = filter(is_even_lines, file_lines)
    print 'Rows with an even ID:'
    pprint.pprint(even_lines)

    print '----'

    # double the weight of every remaining row
    double_weights_lines = map(double_weights, even_lines)
    print 'Rows with the weight doubled:'
    pprint.pprint(double_weights_lines)

    print '----'

    # sum all the weights
    # each element passed to sum_weights is a list, so an initial value for the sum must be supplied; here it is 0
    sum_val = reduce(sum_weights, double_weights_lines, 0)
    print 'Sum of the weights of all rows:'
    print sum_val

Output:

Original rows in the file:
[['1', 'name1', 'value1', '11'],
 ['2', 'name2', 'value2', '12'],
 ['3', 'name3', 'value3', '13'],
 ['4', 'name4', 'value4', '14'],
 ['5', 'name5', 'value5', '15'],
 ['6', 'name6', 'value6', '16'],
 ['7', 'name7', 'value7', '17'],
 ['8', 'name8', 'value8', '18'],
 ['9', 'name9', 'value9', '19'],
 ['10', 'name10', 'value10', '20']]
----
Rows with an even ID:
[['2', 'name2', 'value2', '12'],
 ['4', 'name4', 'value4', '14'],
 ['6', 'name6', 'value6', '16'],
 ['8', 'name8', 'value8', '18'],
 ['10', 'name10', 'value10', '20']]
----
Rows with the weight doubled:
[['2', 'name2', 'value2', 24],
 ['4', 'name4', 'value4', 28],
 ['6', 'name6', 'value6', 32],
 ['8', 'name8', 'value8', 36],
 ['10', 'name10', 'value10', 40]]
----
Sum of the weights of all rows:
160

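Characteristics of map, reduce, and filter

filter: takes a list and returns a list of the elements that satisfy a condition; similar to where a=1 in SQL
map: takes a list, processes each element, and returns a list of the processed elements; similar to select a*2 in SQL
reduce: takes a list and aggregates it (running total, summary, average, and so on); similar to select sum(a), avg(b) in SQL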
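For readers on Python 3, the same pipeline can be sketched as follows. Two differences from the Python 2 behavior described in this article apply: reduce has to be imported from functools, and filter and map return lazy iterators rather than lists. The sample rows are generated in memory to mirror the data file, and the lambdas are illustrative rather than the article's functions.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

# Ten rows shaped like the sample file: ID, key, value, weight.
rows = [[str(i), "name%d" % i, "value%d" % i, str(10 + i)] for i in range(1, 11)]

# Keep the rows with an even ID, as in the run shown above.
even_rows = filter(lambda fields: int(fields[0]) % 2 == 0, rows)
# Build new rows instead of mutating the input lists.
doubled = map(lambda fields: fields[:-1] + [int(fields[-1]) * 2], even_rows)
# The initial value 0 is needed because the items are lists, not numbers.
total = reduce(lambda acc, fields: acc + fields[-1], doubled, 0)

print(total)  # 160
```

Nothing is computed until reduce pulls items through the chain, so the intermediate lists are never materialized.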

Official documentation for these functions

map(function, iterable, …)

Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. If one iterable is shorter than another it is assumed to be extended with None items. If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation). The iterable arguments may be a sequence or any iterable object; the result is always a list.

reduce(function, iterable[, initializer])

Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned. Roughly equivalent to:

def reduce(function, iterable, initializer=None):
    it = iter(iterable)
    if initializer is None:
        try:
            initializer = next(it)
        except StopIteration:
            raise TypeError('reduce() of empty sequence with no initial value')
    accum_value = initializer
    for x in it:
        accum_value = function(accum_value, x)
    return accum_value
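The edge cases described in that passage (an initializer, a single-item iterable, an empty iterable) can be checked with a short sketch. It assumes Python 3, where reduce is imported from functools; the helper name add is only for illustration.

```python
from functools import reduce

def add(x, y):
    return x + y

total = reduce(add, [1, 2, 3, 4, 5])      # ((((1+2)+3)+4)+5)
empty_with_default = reduce(add, [], 0)   # the initializer serves as the default
single = reduce(add, [7])                 # one item and no initializer: returned as-is

# An empty iterable without an initializer raises TypeError.
try:
    reduce(add, [])
    raised = False
except TypeError:
    raised = True

print(total, empty_with_default, single, raised)
```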

filter(function, iterable)

Construct a list from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If iterable is a string or a tuple, the result also has that type; otherwise it is always a list. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.

Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.

See itertools.ifilter() and itertools.ifilterfalse() for iterator versions of this function, including a variation that filters for elements where the function returns false.

References:

http://docs.python.org/2/library/functions.html

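Updating one MySQL table from another

I recently ran into this requirement: tables A and B in MySQL both have (id, age) columns, and I wanted to read the age column from table B and write it into the age column of the row with the same id in table A. My first idea was to read table B with Python, collect the data as {id: age}, and then update table A one id at a time.

The two tables and their data are defined as follows:

Table A definition: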

Field	Type	Comment
id	int(11)
name	varchar(20)
age	int(11)

Data:

1,name1,0
2,name2,0
3,name3,0
4,name4,0
5,name5,0

Table B definition:

Field	Type	Comment
id	int(11)
age	int(11)

Data:

1,11
2,21
3,31
4,41
5,51

Implementing it in Python

# -*- encoding:utf8 -*-
'''
@author: crazyant.net
Read the (id, age) rows from table B, then update table A row by row.
'''
from common.DBUtil import DB

dbUtil = DB('127.0.0.1', 3306, 'root', '', 'test')
rs = dbUtil.query("SELECT id,age FROM table_b")
for row in rs:
    (idv, age) = row
    print (idv, age)
    # Both values are integers read from table_b, so plain string formatting is used here
    update_sql = "update table_a set age='%s' where id='%s';" % (age, idv)
    print(update_sql)
    dbUtil.update(update_sql)
print('over')

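In fact, a single SQL statement does the job

The code was so simple that I searched online to see whether MySQL can update one table from another, and it turns out that UPDATE itself supports multiple tables.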

UPDATE table_a,table_b SET table_a.age=table_b.age WHERE table_a.id=table_b.id;

Using Python code for this is overkill, like shooting a mosquito with a cannon.
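To try the idea without a MySQL server, here is a minimal sketch using Python's standard sqlite3 module with the same sample data. Note the swap: SQLite does not accept MySQL's multi-table UPDATE syntax, so the sketch uses a correlated subquery instead, which gives the same result for this data. It is written for Python 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, age INTEGER);
    INSERT INTO table_a VALUES
        (1,'name1',0),(2,'name2',0),(3,'name3',0),(4,'name4',0),(5,'name5',0);
    INSERT INTO table_b VALUES (1,11),(2,21),(3,31),(4,41),(5,51);
""")

# Correlated subquery standing in for MySQL's
# "UPDATE table_a,table_b SET ... WHERE table_a.id=table_b.id".
# The WHERE clause keeps rows without a match in table_b untouched.
conn.execute("""
    UPDATE table_a
    SET age = (SELECT age FROM table_b WHERE table_b.id = table_a.id)
    WHERE id IN (SELECT id FROM table_b)
""")
conn.commit()

ages = conn.execute("SELECT id, age FROM table_a ORDER BY id").fetchall()
print(ages)
conn.close()
```

Either way, the database does the whole update in one statement, with no per-row round trip from Python.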

[DedeCMS migration] Reading the DedeCMS MySQL database to generate all article links


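This article explains how to read the tables of a DedeCMS MySQL database and generate the links to all articles.

It uses DBUtil, a module that wraps commonly used MySQL functions; the code is at the link.

1. Confirm how the links are structured

This information is stored in the namerule column of dede_arctype, the DedeCMS category table.

Run the SQL statement: SELECT namerule FROM dede_arctype;

Every row returns the same value (it is usually left unmodified): {typedir}/{Y}/{M}{D}/{aid}.html

This means a link is made up of the following fields:

{typedir}: the category directory, from the typedir column of dede_arctype;
{Y}{M}{D}: the article's publication time, from the pubdate column of dede_archives;
{aid}: the article ID, from the id column of dede_archives.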
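Filling in such a rule for one article can be sketched as below. The function name build_article_url and the sample values (category directory, timestamp, article ID) are hypothetical; the sketch assumes pubdate is a Unix timestamp and that typedir may contain a {cmspath} placeholder. It converts the timestamp in UTC so the result is reproducible, and is written for Python 3.

```python
import datetime

def build_article_url(namerule, typedir, pubdate, aid, cmspath=""):
    """Fill a namerule such as {typedir}/{Y}/{M}{D}/{aid}.html for one article."""
    day = datetime.datetime.fromtimestamp(pubdate, tz=datetime.timezone.utc)
    url = (namerule
           .replace("{typedir}", typedir)
           .replace("{Y}", day.strftime("%Y"))
           .replace("{M}", day.strftime("%m"))
           .replace("{D}", day.strftime("%d"))
           .replace("{aid}", str(aid)))
    # typedir itself may carry a {cmspath} placeholder for the site root
    return url.replace("{cmspath}", cmspath)

# Hypothetical article: ID 42, published at 2013-10-22 00:00:00 UTC
url = build_article_url(
    "{typedir}/{Y}/{M}{D}/{aid}.html",
    "{cmspath}/a/news",
    1382400000,
    42,
    cmspath="http://www.crazyant.net",
)
print(url)
```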

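2. Read MySQL and assemble the URLs

The overall process:

Read the dede_arctype and dede_archives tables to get the information for every link (article ID, category name, category directory, title, publication date, custom file name).
For each link, assemble the URL as described in step 1.
At this point we have the ID, title, and URL of every link.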

# -*- encoding:utf8 -*-
from common import DBUtil
import pprint
import datetime

dbUtil = DBUtil.DB('127.0.0.1', 3306, 'root', '', 'oiayafnq_lwqn')

site_home_url = "http://www.crazyant.net"


class Link():
    def __init__(self, p_linkid, p_title, p_linkurl):
        self.linkid = p_linkid
        self.title = p_title
        self.linkurl = p_linkurl

    def __str__(self):
        strv = "%s\n%s\n%s\n" % (self.linkid, self.title, self.linkurl)
        return strv


class DedeLinks():
    def __init__(self):
        self.allLinks = []

    def getDbArticlesInfo(self):
        '''
        Fetch the link information and the matching category from the database
        '''
        rs = dbUtil.query('''
            SELECT
                dede_archives.id,dede_arctype.typename,dede_arctype.typedir,typeid,title,pubdate,filename
            FROM
                dede_archives,dede_arctype
            WHERE dede_archives.typeid=dede_arctype.id;
        ''')
        return rs

    def equipLink(self, typedir, urldate, filename, linkid):
        '''
        Build a URL from the category directory, the publication date,
        the custom file name (may be empty) and the link ID
        '''
        article_date = str(datetime.date.fromtimestamp(urldate)).replace("-", "")
        #print filename
        link_dir = "%s/%s/%s" % (typedir, article_date[:4], article_date[4:])
        if filename.strip() != "":
            link = "%s/%s.html" % (link_dir, filename)
        else:
            link = "%s/%s.html" % (link_dir, linkid)
        link = link.replace("{cmspath}", site_home_url)
        return link

    def getAllDedeLinks(self):
        rs = self.getDbArticlesInfo()
        for row in rs:
            (linkid, typename, typedir, typeid, title, pubdate, filename) = row
            linkurl = self.equipLink(typedir, pubdate, filename, linkid)
            linkNode = Link(linkid, title, linkurl)
            self.allLinks.append(linkNode)

    def process(self):
        self.getAllDedeLinks()


if __name__ == "__main__":
    dlinks = DedeLinks()
    dlinks.process()
    for linkNode in dlinks.allLinks:
        print(linkNode)

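Other modules can import this module and use dlinks.allLinks to access all the links; each list element carries the link ID, the link title, and the link URL.

A commonly used wrapper class for accessing MySQL from Python

Accessing MySQL from Python is fairly simple; for details, see my other article (link).

In practice I only use two MySQL operations, query and update. Below is my wrapper for them, which you can copy and use directly.

File name: DBUtil.py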

# -*- encoding:utf8 -*-
'''
@author: crazyant.net
@version: 2013-10-22

Wrappers for commonly used MySQL functions
'''

import MySQLdb


class DB():
    def __init__(self, DB_HOST, DB_PORT, DB_USER, DB_PWD, DB_NAME):
        self.DB_HOST = DB_HOST
        self.DB_PORT = DB_PORT
        self.DB_USER = DB_USER
        self.DB_PWD = DB_PWD
        self.DB_NAME = DB_NAME

        self.conn = self.getConnection()

    def getConnection(self):
        return MySQLdb.Connect(
            host=self.DB_HOST,    # MySQL host
            port=self.DB_PORT,    # port
            user=self.DB_USER,    # user name
            passwd=self.DB_PWD,   # password
            db=self.DB_NAME,      # database name
            charset='utf8'        # character set
        )

    def query(self, sqlString):
        cursor = self.conn.cursor()
        cursor.execute(sqlString)
        returnData = cursor.fetchall()
        cursor.close()
        # The connection stays open so the same DB object can be reused
        return returnData

    def update(self, sqlString):
        cursor = self.conn.cursor()
        cursor.execute(sqlString)
        self.conn.commit()
        cursor.close()

    def close(self):
        # Call once, after all queries and updates are done
        self.conn.close()


if __name__ == "__main__":
    db = DB('127.0.0.1', 3306, 'root', '', 'wordpress')
    print(db.query("show tables;"))
    db.close()

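Usage is shown in the main block at the bottom of the file: call query to run a SELECT statement and get the results, or call update for INSERT, DELETE and similar operations.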
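The same query/update shape can be tried without a MySQL server. The sketch below is a hypothetical stand-in (the class name SqliteDB and the table t are made up) built on the standard sqlite3 module and written for Python 3. It also passes values as query parameters instead of formatting them into the SQL string; sqlite3 uses ? placeholders, while MySQLdb uses %s placeholders with cursor.execute(sql, args).

```python
import sqlite3

class SqliteDB:
    """Hypothetical stand-in that mirrors DBUtil's query/update shape."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def query(self, sql, params=()):
        cursor = self.conn.cursor()
        cursor.execute(sql, params)
        rows = cursor.fetchall()
        cursor.close()
        return rows

    def update(self, sql, params=()):
        cursor = self.conn.cursor()
        cursor.execute(sql, params)
        self.conn.commit()
        affected = cursor.rowcount  # rows changed by INSERT/UPDATE/DELETE
        cursor.close()
        return affected

    def close(self):
        self.conn.close()

db = SqliteDB()
db.update("CREATE TABLE t (id INTEGER, name TEXT)")
db.update("INSERT INTO t VALUES (?, ?)", (1, "name1"))
inserted = db.update("INSERT INTO t VALUES (?, ?)", (2, "name2"))
rows = db.query("SELECT id, name FROM t WHERE id >= ? ORDER BY id", (1,))
db.close()
print(inserted, rows)
```

Parameters keep quoting and escaping in the driver's hands, which matters as soon as the values come from user input.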

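Two ways to run shell commands from Python

There are two ways to run shell programs from Python: the commands package and the subprocess package. Both are built-in modules.

Running shell commands with the built-in commands module

commands wraps Python's os.popen(): it takes a shell command string as its argument and returns the command's output together with its exit status.

The module is deprecated in favor of subprocess, and it no longer exists in Python 3.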


# coding=utf-8
'''
Created on 2013-11-22

@author: crazyant.net
'''
import commands
import pprint


def cmd_exe(cmd_String):
    print("will exe cmd,cmd:" + cmd_String)
    return commands.getstatusoutput(cmd_String)


if __name__ == "__main__":
    pprint.pprint(cmd_exe("ls -la"))

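Running shell commands with the newer subprocess module

The subprocess module is intended to replace os.system, os.spawn*, os.popen*, popen2.* and commands.* for running external commands, and it is the recommended approach.

subprocess lets you create child processes, specify the pipes for each child's input, output and error output when creating it, and retrieve the output and exit status once it has run.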

# coding=utf-8
'''
Created on 2013-11-22

@author: crazyant.net
'''
import shlex
import datetime
import subprocess
import time


def execute_command(cmdstring, cwd=None, timeout=None, shell=False):
    """Run a shell command.

    Wraps subprocess.Popen and supports a timeout. stdout and stderr are
    not redirected, so the child's output goes straight to the screen.

    Args:
        cwd: working directory for the command; if set, the child process
            changes to cwd before running
        timeout: timeout in seconds; fractions are allowed, resolution 0.1 s
        shell: whether to run the command through the shell
    Returns: the return code, as a string
    Raises:  Exception: the command timed out
    """
    if shell:
        cmdstring_list = cmdstring
    else:
        cmdstring_list = shlex.split(cmdstring)
    if timeout:
        end_time = datetime.datetime.now() + datetime.timedelta(seconds=timeout)

    # No pipes are given for stdout and stderr, so both are printed to the screen
    sub = subprocess.Popen(cmdstring_list, cwd=cwd, stdin=subprocess.PIPE, shell=shell, bufsize=4096)

    # Popen.poll() checks whether the child has finished; if it has, the
    # return code is set, returned, and stored in Popen.returncode
    while sub.poll() is None:
        time.sleep(0.1)
        if timeout:
            if end_time <= datetime.datetime.now():
                raise Exception("Timeout: %s" % cmdstring)

    return str(sub.returncode)


if __name__ == "__main__":
    print(execute_command("ls"))

You can also pass pipes (subprocess.PIPE) for stdin and stdout to Popen, so that the command's output can be read directly into a variable.
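A minimal sketch of capturing output through pipes follows. It is written for Python 3 and, so that it runs on any platform, launches the current Python interpreter as the child instead of a shell command; the one-line child program is made up for illustration.

```python
import subprocess
import sys

# The child reads its stdin and prints it back in upper case.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# communicate() writes the input, reads both pipes to the end, and waits
# for the child to exit; the timeout argument requires Python 3.3+.
out, err = child.communicate(input=b"hello", timeout=10)
text = out.decode().strip()
print(child.returncode, text)
```

communicate() is used rather than reading child.stdout directly because it drains stdout and stderr together, which avoids a deadlock when the child fills one pipe while the parent waits on the other.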

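Summary

Running shell commands from Python is sometimes necessary, for example when using Python threads to launch different shell processes. subprocess is currently the approach recommended by the official Python documentation and supports the most features, so I recommend using it.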
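The thread scenario mentioned in the summary could look roughly like this. It is a sketch for Python 3 using concurrent.futures from the standard library; the helper name run and the three jobs are hypothetical, and each job is a small Python one-liner run through the current interpreter so the example works without a Unix shell.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run(code):
    """Run one child process and return (return code, stdout text)."""
    done = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=30,
    )
    return done.returncode, done.stdout.decode().strip()

# Hypothetical jobs; in real use these would be commands such as "ls -la".
jobs = ["print(1 + 1)", "print('a' * 3)", "import sys; sys.exit(3)"]

# Each worker thread blocks on its own child process, so the jobs overlap.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run, jobs))

print(results)
```

pool.map returns results in the order of the jobs list, regardless of which child finishes first.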

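Commonly used date helper functions in Python

When processing log data you often need to do calculations on dates, such as adding days to a date, finding the number of days between two dates, or finding the day of the week for a date. This article collects a few commonly used Python date functions and is updated from time to time.

Here is the code (file name DateUtil.py); what each function does is described in its docstring: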

# -*- encoding:utf8 -*-
'''
@author: crazyant
@version: 2013-10-12
'''
import datetime, time

# Date formats; change them as needed, e.g. to "%Y年%m月%d日"
format_date = "%Y-%m-%d"
format_datetime = "%Y-%m-%d %H:%M:%S"


def getCurrentDate():
    '''
    Return the current date as a string such as 2013-09-10
    '''
    return time.strftime(format_date, time.localtime(time.time()))


def getCurrentDateTime():
    '''
    Return the current date and time as a string such as 2013-09-10 11:22:11
    '''
    return time.strftime(format_datetime, time.localtime(time.time()))


def getCurrentHour():
    '''
    Return the hour of the current time as a string, e.g. "16" at 4 p.m.
    '''
    currentDateTime = getCurrentDateTime()
    return currentDateTime[-8:-6]


def getDateElements(sdate):
    '''
    Parse a date string and return a struct holding its components
    Input: 2013-09-10 or 2013-09-10 22:11:22
    Example: "2013-04-01 21:22:33" returns time.struct_time(tm_year=2013, tm_mon=4, tm_mday=1, tm_hour=21, tm_min=22, tm_sec=33, tm_wday=0, tm_yday=91, tm_isdst=-1)
    '''
    date_type = judgeDateFormat(sdate)
    if date_type == 0:
        return None
    elif date_type == 1:
        dformat = format_date
    else:
        dformat = format_datetime
    return time.strptime(sdate, dformat)


def getDateToNumber(date1):
    '''
    Strip the hyphens, colons and spaces from a date string:
    Input: 2013-04-05, returns 20130405
    Input: 2013-04-05 22:11:23, returns 20130405221123
    '''
    return date1.replace("-", "").replace(":", "").replace(" ", "")


def judgeDateFormat(datestr):
    '''
    Detect the date format: returns 1 for "%Y-%m-%d", 2 for "%Y-%m-%d %H:%M:%S", otherwise 0
    Parameter datestr: the date string
    '''
    try:
        datetime.datetime.strptime(datestr, format_date)
        return 1
    except (ValueError, TypeError):
        pass

    try:
        datetime.datetime.strptime(datestr, format_datetime)
        return 2
    except (ValueError, TypeError):
        pass

    return 0


def minusTwoDate(date1, date2):
    '''
    Subtract date2 from date1 and return the resulting datetime.timedelta
    Its days, seconds and microseconds attributes can be read directly
    '''
    if judgeDateFormat(date1) == 0 or judgeDateFormat(date2) == 0:
        return None
    d1Elements = getDateElements(date1)
    d2Elements = getDateElements(date2)
    if not d1Elements or not d2Elements:
        return None
    d1 = datetime.datetime(d1Elements.tm_year, d1Elements.tm_mon, d1Elements.tm_mday, d1Elements.tm_hour, d1Elements.tm_min, d1Elements.tm_sec)
    d2 = datetime.datetime(d2Elements.tm_year, d2Elements.tm_mon, d2Elements.tm_mday, d2Elements.tm_hour, d2Elements.tm_min, d2Elements.tm_sec)
    return d1 - d2


def dateAddInDays(date1, addcount):
    '''
    Add a number of days to a date (or subtract them) and return the new date
    Parameter date1: the date to start from
    Parameter addcount: the number of days, e.g. 1, 2, 3, -1, -2, -3; negative values subtract
    '''
    try:
        addtime = datetime.timedelta(days=int(addcount))
        d1Elements = getDateElements(date1)
        d1 = datetime.datetime(d1Elements.tm_year, d1Elements.tm_mon, d1Elements.tm_mday)
        datenew = d1 + addtime
        return datenew.strftime(format_date)
    except Exception as e:
        print(e)
        return None


def is_leap_year(pyear):
    '''
    Return whether the given year is a leap year
    '''
    try:
        datetime.datetime(pyear, 2, 29)
        return True
    except ValueError:
        return False


def dateDiffInDays(date1, date2):
    '''
    Return the number of days between two dates: positive if date1 is later than date2, negative otherwise
    '''
    minusObj = minusTwoDate(date1, date2)
    try:
        return minusObj.days
    except AttributeError:
        return None


def dateDiffInSeconds(date1, date2):
    '''
    Return the number of seconds between two dates
    '''
    minusObj = minusTwoDate(date1, date2)
    try:
        return minusObj.days * 24 * 3600 + minusObj.seconds
    except AttributeError:
        return None


def getWeekOfDate(pdate):
    '''
    Return the day of the week for a date as a number from 0 to 6, where 0 is Sunday
    '''
    pdateElements = getDateElements(pdate)

    weekday = int(pdateElements.tm_wday) + 1
    if weekday == 7:
        weekday = 0
    return weekday


if __name__ == "__main__":
    '''
    Some test code
    '''
    print(judgeDateFormat("2013-04-01"))
    print(judgeDateFormat("2013-04-01 21:22:33"))
    print(judgeDateFormat("2013-04-31 21:22:33"))
    print(judgeDateFormat("2013-xx"))
    print("--")
    print(datetime.datetime.strptime("2013-04-01", "%Y-%m-%d"))
    print('elements')
    print(getDateElements("2013-04-01 21:22:33"))
    print('minus')
    print(minusTwoDate("2013-03-05", "2012-03-07").days)
    print(dateDiffInSeconds("2013-03-07 12:22:00", "2013-03-07 10:22:00"))
    print(type(getCurrentDate()))
    print(getCurrentDateTime())
    print(dateDiffInSeconds(getCurrentDateTime(), "2013-06-17 14:00:00"))
    print(getCurrentHour())
    print(dateAddInDays("2013-04-05", -5))
    print(getDateToNumber("2013-04-05"))
    print(getDateToNumber("2013-04-05 22:11:33"))

    print(getWeekOfDate("2013-10-01"))
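For comparison, the same calculations can be done with the datetime module alone. The sketch below is written for Python 3 and reuses the dates from the test code above; the variable names are only for illustration.

```python
import datetime

fmt = "%Y-%m-%d"
fmt_time = "%Y-%m-%d %H:%M:%S"
parse = datetime.datetime.strptime

# Days between two dates (what dateDiffInDays computes)
diff_days = (parse("2013-03-05", fmt) - parse("2012-03-07", fmt)).days

# Seconds between two timestamps (what dateDiffInSeconds computes)
diff_seconds = (parse("2013-03-07 12:22:00", fmt_time)
                - parse("2013-03-07 10:22:00", fmt_time)).total_seconds()

# Date arithmetic (what dateAddInDays computes)
five_days_earlier = (parse("2013-04-05", fmt)
                     - datetime.timedelta(days=5)).strftime(fmt)

# isoweekday() gives Monday=1 ... Sunday=7; "% 7" maps Sunday to 0,
# matching the 0-6 numbering of getWeekOfDate
week_number = parse("2013-10-01", fmt).isoweekday() % 7

print(diff_days, diff_seconds, five_days_earlier, week_number)
```

Working with datetime objects until the final strftime call avoids re-parsing strings at every step, which the string-based helpers have to do.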

From： https://blog.51cto.com/u_6186189/7049740
