sscanf, scanf, fscanf, and regular expressions

Header file: #include <stdio.h>

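Function definition: int sscanf(const char *str, const char *format, ...);
Description: sscanf() reads the string str and converts its contents according to the format string format. The conversion specifiers are the same as for scanf(). The converted values are stored through the corresponding pointer arguments.
Return value: on success, the number of input items successfully matched and assigned, which may be fewer than the number of arguments supplied. If the input ends before the first conversion, EOF is returned.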
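
Because the return value is a count of assigned items, comparing it with the expected count is the usual way to detect a partial match. The following is a minimal sketch; the helper name parse_size and the "WIDTHxHEIGHT" input format are assumptions made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parses "WIDTHxHEIGHT" (e.g. "640x480").
   Returns 1 on success, 0 if the text does not match. */
int parse_size(const char *text, int *width, int *height)
{
    /* sscanf returns the number of items assigned, so exactly 2 means
       both integers were read. A partial match returns 1, and empty
       input returns EOF; both are treated as failure here. */
    int assigned = sscanf(text, "%dx%d", width, height);
    return assigned == 2;
}
```

Calling parse_size("640x480", &w, &h) stores 640 and 480, while parse_size("640", &w, &h) reports failure because only one item could be assigned.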

#include <stdio.h>
int main(void)
{
      const char *s = "iios/12DDWDFF@122";
      char buf[20];
      /* Skip everything up to '/', match the '/', then read up to '@'.
         The width 19 keeps the result inside buf. */
      sscanf(s, "%*[^/]/%19[^@]", buf);
      printf("%s\n", buf);
      return 0;
}

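Output: 12DDWDFF
sscanf is similar to scanf: both parse input. The difference is that scanf reads from standard input (stdin, normally the keyboard), whereas sscanf reads from a string that is already in memory.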

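Function prototype:
int scanf(const char *format [, argument]...);
The format string consists of one or more of: {%[*][width][{h | l | I64 | L}]type | ' ' | '\t' | '\n' | any character other than %}.
Note: {a|b|c} means choose one of a, b, c; [d] means d is optional.
width: the maximum field width. It is often omitted, but it should be given whenever the target is a character buffer. For example: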

const char sourceStr[] = "hello, world";
char buf[10] = {0};
sscanf(sourceStr, "%5s", buf); // %5s: read at most 5 characters
printf("%s\n", buf);

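Output: hello
{h | l | I64 | L}: the size of the target argument. h means short; l means long for integer types and double for floating-point types (%lf); L means long double; I64 is a Microsoft-specific modifier for 64-bit integers (standard C uses ll).
type: the conversion character, such as s in %s or d in %d; there are many.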
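
To make the size modifiers concrete, here is a small sketch that reads one value of each size. The struct sample and the function parse_sample are hypothetical names used only for this illustration.

```c
#include <stdio.h>

/* Hypothetical record, used only to show one variable of each size. */
struct sample {
    short s;
    long l;
    double d;
    long double ld;
};

/* Returns the number of fields assigned (4 on success).
   Each modifier must match the pointer type it is paired with:
   %hd -> short*, %ld -> long*, %lf -> double*, %Lf -> long double*. */
int parse_sample(const char *text, struct sample *out)
{
    return sscanf(text, "%hd %ld %lf %Lf",
                  &out->s, &out->l, &out->d, &out->ld);
}
```

A mismatch between the modifier and the pointer type (for example %f with a double*) is undefined behavior, which is why the pairing is spelled out in the comment.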

A special case:
%*[width][{h | l | I64 | L}]type means that input matching the specifier is consumed and discarded; nothing is written to any argument. For example:

const char sourceStr[] = "hello, world";
char buf[10] = {0};
sscanf(sourceStr, "%*s%s", buf); // %*s discards the first whitespace-delimited token, i.e. "hello,"
printf("%s\n", buf);

Output: world
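
A suppressed conversion also does not count toward the return value. The sketch below wraps the same format in a helper (second_word is a made-up name) and adds a width so the buffer cannot overflow.

```c
#include <stdio.h>

/* Reads the second whitespace-separated word of text into out
   (at most 31 characters plus the terminator, so out needs 32 bytes).
   Returns the sscanf result: 1 if a second word was stored. */
int second_word(const char *text, char out[32])
{
    /* %*s consumes the first word without storing it and does not
       count toward the return value. */
    return sscanf(text, "%*s%31s", out);
}
```

For "hello, world" the function returns 1 and stores "world", even though two conversions were processed.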

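Scan sets are supported:
%[a-z] matches any characters from a to z; greedy (matches as many as possible). Ranges written with - are implementation-defined in standard C but are supported by the common C libraries.
%[aB'] matches any of a, B or '; greedy
%[^a] matches any character other than a; greedy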
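
As a sketch of how two greedy sets combine, the helper below (split_alpha_digits, a name invented here) splits text such as "abc123" into a run of letters followed by a run of digits. It assumes the C library supports ranges inside a set.

```c
#include <stdio.h>

/* Splits text such as "abc123" into a run of lowercase letters and a
   run of digits. Both buffers must hold at least 16 bytes.
   Returns the number of runs stored (0, 1 or 2). */
int split_alpha_digits(const char *text, char letters[16], char digits[16])
{
    /* Each set is greedy: it keeps reading while characters belong to
       the set, up to the given width of 15. */
    int n = sscanf(text, "%15[a-z]%15[0-9]", letters, digits);
    return n < 0 ? 0 : n;   /* map EOF (empty input) to 0 */
}
```

A set must match at least one character, so input that starts with a digit stores nothing and the function returns 0.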

1. Common usage.

Example:

char str[512] = {0};
sscanf("123456", "%s", str);
printf("str=%s", str);

2. Read a string of limited length. In the example below, at most 4 characters are read.

Example:

sscanf("123456", "%4s", str);
printf("str=%s", str);

3. Read up to a given character. In the example below, the string is read up to the first space.

Example:

sscanf("123456 abcdedf", "%[^ ]", str);
printf("str=%s", str);

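4. Read a string made up only of a given character set. In the example below, the string read contains only the digits 1 to 9 and lowercase letters.

Example:

sscanf("123456abcdedfBCDEF", "%[1-9a-z]", str);
printf("str=%s", str);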

5. Read up to any character of a given set. In the example below, the string is read up to the first uppercase letter.

Example:

sscanf("123456abcdedfBCDEF", "%[^A-Z]", str);
printf("str=%s", str);
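
When the rest of the string still has to be parsed, it helps to know where the scan stopped. One way, sketched here, is the standard %n directive, which stores the number of characters consumed so far. The helper name split_at_upper is an assumption for illustration.

```c
#include <stdio.h>

/* Copies the leading part of text that contains no uppercase letter into
   head (at least 32 bytes) and returns a pointer to the unread remainder,
   or NULL if text starts with an uppercase letter or is empty. */
const char *split_at_upper(const char *text, char head[32])
{
    int consumed = 0;
    /* %n stores the number of characters read so far; it does not count
       toward the return value. */
    if (sscanf(text, "%31[^A-Z]%n", head, &consumed) != 1)
        return NULL;
    return text + consumed;
}
```

With the input from example 5, head receives "123456abcdedf" and the returned pointer refers to "BCDEF".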

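Some special usages collected here:

Using %[ ]: %[ ] reads a set of characters. If the first character after [ is ^, the set is negated. The brackets may contain one or more characters. An empty set (%[]) is not allowed and leads to unpredictable results; %[^] is not allowed either.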
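
One consequence of the rule against empty sets is that a ] placed immediately after [ (or after [^) is taken as a literal member of the set rather than as the closing bracket. The sketch below uses this to read an INI-style "[name]" header; parse_section and the header format are assumptions chosen for the example.

```c
#include <stdio.h>

/* Hypothetical helper: extracts name from a line such as "[network]".
   name must hold at least 32 bytes. Returns 1 on success, 0 otherwise. */
int parse_section(const char *line, char name[32])
{
    /* " ["      skips leading blanks, then matches a literal '['.
       "%31[^]]" reads up to 31 characters that are not ']'; the ']'
                 right after '^' is a set member, not the terminator. */
    return sscanf(line, " [%31[^]]", name) == 1;
}
```

For "[network]" the function stores "network"; for "[]" it fails because the set must match at least one character.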

%[a-z] reads characters in the range a-z and stops at the first character outside that range, e.g.

char s[] = "hello, my friend";  // note: the comma is not in a-z
char string[32] = "";
sscanf(s, "%[a-z]", string);    // string=hello

%[^a-z] reads characters that are not in the range a-z and stops at the first character inside that range, e.g.

char s[] = "HELLOkitty";        // note: k is in a-z, so the scan stops there
sscanf(s, "%[^a-z]", string);   // string=HELLO

%*[^=]: a leading * means the matched input is not stored in any variable; the matching characters are skipped.

char s[] = "notepad=1.0.0.1001";
char szfilename[32] = "";
int i = sscanf(s, "%*[^=]", szfilename);
// szfilename is still empty because nothing was stored; i is 0
i = sscanf(s, "%*[^=]=%31s", szfilename);
// szfilename=1.0.0.1001

%40c reads 40 characters, whitespace included, and does not append a terminating '\0'.
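
Because %c with a width never writes a terminator, the caller has to add one before treating the result as a string. A minimal sketch with a width of 4 instead of 40 (first_four is a hypothetical name); the length check is there because C libraries differ in how they treat input shorter than the width.

```c
#include <stdio.h>
#include <string.h>

/* Copies exactly 4 characters from text into out and terminates the
   result. out must hold at least 5 bytes.
   Returns 1 on success, 0 if text is shorter than 4 characters. */
int first_four(const char *text, char out[5])
{
    if (strlen(text) < 4)
        return 0;
    /* %4c stores 4 characters, whitespace included, and does not
       append '\0', so the terminator is written by hand. */
    if (sscanf(text, "%4c", out) != 1)
        return 0;
    out[4] = '\0';
    return 1;
}
```

Unlike %s, %c does not skip leading whitespace, so "a b cdef" yields "a b ".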

%[^=] reads characters until an '=' is met. More characters can follow the '^', e.g.

char s[] = "notepad=1.0.0.1001";
char szfilename[32] = "";
int i = sscanf(s, "%31[^=]", szfilename);
// szfilename=notepad

If the format is %[^=:], notepad can also be read from notepad:1.0.0.1001.
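
Putting the pieces together, a line of the form key=value or key:value can be split in one call. This is only a sketch: parse_pair and the 32-byte buffers are assumptions, and values containing spaces are not handled.

```c
#include <stdio.h>

/* Hypothetical helper: splits "key=value" or "key:value".
   key and value must each hold at least 32 bytes.
   Returns 1 if both parts were read, 0 otherwise. */
int parse_pair(const char *line, char key[32], char value[32])
{
    /* %31[^=:]  key: everything before the first '=' or ':'
       %*1[=:]   exactly one separator, consumed but not stored
       " %31s"   optional blanks, then the value up to whitespace */
    return sscanf(line, "%31[^=:]%*1[=:] %31s", key, value) == 2;
}
```

For "notepad=1.0.0.1001" this stores "notepad" and "1.0.0.1001"; a line without a separator returns 0 because only the key could be assigned.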

Application example: reading RINEX ephemeris data
Below is a fragment of RINEX ephemeris data:
2 10 1 16 2 0 0.0 2.159508876503D-04 4.320099833421D-12 0.000000000000D+00
9.500000000000D+01 1.165625000000D+01 5.293791936254D-09-3.076667279839D+00
7.841736078262D-07 9.284008061513D-03 4.578381776810D-06 5.153754629135D+03
5.256000000000D+05-1.527369022369D-07 1.733543031426D+00 1.303851604462D-08
9.397921470752D-01 2.780625000000D+02 2.941268779788D+00-8.269273019842D-09
1.892935991239D-10 1.000000000000D+00 1.566000000000D+03 0.000000000000D+00
2.000000000000D+00 0.000000000000D+00-1.722946763039D-08 9.500000000000D+01
5.184000000000D+05
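Notice that some values run together with no separating space, such as 5.293791936254D-09-3.076667279839D+00 on the second line. How can each value be extracted in that case? The code is as follows: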

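#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int year, month, day, hour, prn, minute;
    float second;
    char t1[30], temp[10];
    double d1;
    FILE *RinexEPP_file;

    /* Backslashes in a Windows path must be escaped inside a C string. */
    if ((RinexEPP_file = fopen("e:\\data.txt", "r")) == NULL) {
        fprintf(stderr, "Cannot open input file.\n");
        exit(1);
    }
    /* Epoch fields: PRN, year, month, day, hour, minute. */
    if (fscanf(RinexEPP_file, "%d %d %d %d %d %d",
               &prn, &year, &month, &day, &hour, &minute) != 6) return 1;
    if (fscanf(RinexEPP_file, "%4f", &second) != 1) return 1;
    /* Mantissa: read up to, but not including, the 'D'. */
    if (fscanf(RinexEPP_file, "%25[^D]", t1) != 1) return 1;
    /* Exponent: the 'D', the sign and two digits (four characters). */
    if (fscanf(RinexEPP_file, "%4s", temp) != 1) return 1;
    temp[0] = 'E';      /* standard atof() does not accept a 'D' exponent */
    strcat(t1, temp);   /* join the parts before and after the D */
    d1 = atof(t1);
    /* Repeat the last three steps for each remaining value (d2, d3, ...). */
    printf("%g\n", d1);
    fclose(RinexEPP_file);
    return 0;
}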

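Wrapped up as a function:

char *myscanf(FILE *RinexEPP_file, char t1[30]);

/* Reads one D-exponent value into the caller's buffer t1 (at least 30 bytes).
   A local array must not be returned, so the caller supplies the buffer.
   Returns t1, or NULL if no value could be read. */
char *myscanf(FILE *RinexEPP_file, char t1[30])
{
    char temp[10];
    if (fscanf(RinexEPP_file, "%25[^D]", t1) != 1) return NULL;  /* read up to the D */
    if (fscanf(RinexEPP_file, "%4s", temp) != 1) return NULL;    /* D, sign, two digits */
    temp[0] = 'E';   /* standard atof()/strtod() do not accept a 'D' exponent */
    strcat(t1, temp);
    return t1;
}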
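
The same idea also works on a line that has already been read into memory (for example with fgets), using sscanf and %n to step through the values. The sketch below is one possible variant, not the method used above: parse_d_fields is a made-up name, and it assumes every value has the form mantissa, D, sign, two-digit exponent, as in the sample data.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parses up to max_values Fortran-style numbers (e.g. "5.29D-09") from
   line, even when adjacent values touch. Returns how many were stored. */
int parse_d_fields(const char *line, double *values, int max_values)
{
    int count = 0;
    while (count < max_values) {
        char mantissa[24], number[32];
        int exponent = 0, consumed = 0;
        /* The leading space skips blanks; %23[^D] takes the mantissa;
           the literal D must match; %3d takes the sign and two digits,
           so it cannot run into the next value; %n records progress. */
        if (sscanf(line, " %23[^D]D%3d%n",
                   mantissa, &exponent, &consumed) != 2)
            break;
        /* Rebuild the number with an 'E' exponent that strtod accepts. */
        snprintf(number, sizeof number, "%sE%d", mantissa, exponent);
        values[count++] = strtod(number, NULL);
        line += consumed;   /* continue right after this value */
    }
    return count;
}
```

Applied to the second line of the fragment, this yields four values, with the touching pair 5.293791936254D-09 and -3.076667279839D+00 separated correctly because the exponent width is limited to three characters.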

Kyle's Blog
